A Combinatorial Approach to Nonlocality and Contextuality 

Tobias Fritz, Anthony Leverrier, and Ana Belen Sainz 

Abstract. So far, most of the literature on (quantum) contextuality and the Kochen-Specker 
theorem seems either to concern particular examples of contextuality, or be considered as quantum 
logic. Here, we develop a general formalism for contextuality scenarios based on the combinatorics 
of hypergraphs which significantly refines a similar recent approach by Cabello, Severini and 
Winter (CSW). In contrast to CSW, we explicitly include the normalization of probabilities, which 
gives us a much finer control over the various sets of probabilistic models like classical, quantum 
and generalized probabilistic. In particular, our framework specializes to (quantum) nonlocality in 
the case of Bell scenarios, which arise very naturally from the Foulis-Randall product. In the spirit 
of CSW, we find close relationships to various invariants studied in combinatorics. The recently 
proposed Local Orthogonality Principle turns out to be a special case of a general principle for 
contextuality scenarios related to the Shannon capacity of graphs. Our results imply that it is 
dominated by a low level of the Navascués-Pironio-Acín hierarchy of semidefinite programs, which
we apply to contextuality scenarios. 

We hope that our approach may also serve as an introduction for combinatorialists to the 
subject of nonlocality and contextuality. Our conjectures on graphs whose Shannon capacity 
coincides with their independence number may be of particular interest. 

1. Introduction 

Much effort has been devoted to understanding the mysteries of quantum theory. In particu- 
lar, this applies to the phenomena known as quantum nonlocality and quantum contextuality. 
Bell's theorem [Bel64] shows that no theory can make the same predictions as quantum theory, 
while jointly satisfying the properties of realism, locality and free will. This is often abbreviated 
to the statement that quantum theory displays nonlocality.¹ Similarly, the Kochen-Specker theorem
[KS67] states that quantum theory is at variance with any attempt at assigning deterministic
values to all observables in a way which would be consistent with the functional relationships be- 
tween these observables. Such an impossibility is generally known as contextuality, since it means 
that any potential "hidden" value of an observable will necessarily have to depend on the context 
in which it is probed. 

It is often stated that nonlocality is, at the mathematical level, a particular case of contextuality. 
However, it is rarely made explicit what this means precisely. Moreover, the study of contextuality 
so far often seems to have been concerned with particular examples of contextuality and "small" 
proofs of the Kochen-Specker theorem, while a general theory has hardly been developed. Some 
notable exceptions are the following: 

(a) The study of test spaces in quantum logic [CMW00, Wil09], 

(b) Spekkens' work on measurement and preparation contextuality [Spe05, LSW11], 

(c) The graph-theoretic approach of Cabello, Severini and Winter [CSW10], 

(d) The sheaf-theoretic approach pioneered by Abramsky and Brandenburger [AB11]. 

Although test spaces are usually considered in the context of quantum logic and state spaces, 
they serve equally well for the study of contextuality, which is intimately related. This is one of 
our main themes: a test space can be considered as a contextuality scenario, and this is the 
term we use. As in [CSW10], we take a contextuality scenario to be a specification of a collection 
of measurements which says how many outcomes each measurement has and which measurements 
have which outcomes in common. We show how the n-party Bell scenario with k m-outcome 
measurements per party can be regarded as a natural contextuality scenario which one obtains by 
taking the n-fold Foulis-Randall product of the single-party scenario which describes k indepen- 
dent m-outcome measurements. More generally, the Foulis-Randall product is a product operation 
on contextuality scenarios which naturally incorporates the no-signaling condition. Also, we prove 
a combinatorial characterization of extremal probabilistic models. In the Bell scenario case, these 
are the extremal no-signaling boxes. 

Our second main theme is to relate, again inspired by CSW, contextuality scenarios and their 
probabilistic models to graph theory and invariants of graphs like the independence number, 
Lovász number, and fractional packing number. Our approach differs significantly from that of CSW 
in two important respects. First, and most importantly, we explicitly take into account the nor- 
malization of probability from the very beginning. In contrast to this, CSW were working with 
subnormalized probabilities, which seems necessary in order to derive their relations to graph- 
theoretic invariants. We show that such relations still exist even if one retains the normalization 
of probability. This gives us much finer quantitative information and control about contextuality. 
Second, while CSW study the maximal values of contextuality inequalities for classical, quantum, 
and general probabilistic models, we consider the sets of classical, quantum, and general probabilistic
models themselves as the primary objects. We believe that this is a more natural thing to
do, since the actual quantities gathered e.g. from an experiment are outcome probabilities rather
than coefficients of some inequality. Figure 1 summarizes the sets of probabilistic models that we
consider together with their relations to invariants of graphs. The classical set $\mathcal{C}$ corresponds to models
which can be described in terms of noncontextual hidden variables; $\mathcal{Q}$ is the quantum set; the $\mathcal{Q}_n$
family comes from a hierarchy of semidefinite programs characterizing $\mathcal{Q}$; the general probabilistic
set $\mathcal{G}$ contains all conceivable models satisfying the normalization of probability;
finally, the family of sets $\mathrm{CE}^n$ arises from our third main theme.

¹ This terminology can be confusing, since all known interactions in nature are local [Haa96, Zeh06], in a different
sense of the term.

Figure 1. Chain of inclusions between sets of probabilistic models and corresponding
inequalities between graph invariants. We suspect that all inclusions in
the first row are strict for some H.

This third main theme is the concept of Local Orthogonality (LO), which was recently
introduced in [FSA+12] as an information-theoretic principle delimiting the set of quantum correlations
in Bell scenarios. We show how LO naturally arises in our formalism as a special case of a
previously studied concept called Consistent Exclusivity (CE) [Hen12] or Specker's Principle
[Cab12c]. CE builds on the observation that compatibility of quantum observables is a binary
property determined by pairwise commutativity. It can be applied both on the single-copy level
of a scenario, in which case we write $\mathrm{CE}^1$, and on the many-copy level when the same system is
distributed among any number of parties, for which we write $\mathrm{CE}^\infty$. This parallels the distinction
between $\mathrm{LO}^1$ and $\mathrm{LO}^\infty$ that we made in [FSA+12]. While $\mathrm{CE}^1$ relates to the independence number
of a graph, $\mathrm{CE}^\infty$ corresponds to the Shannon capacity (in the sense of graph theory). This
allows us to answer some open questions about $\mathrm{LO}^\infty$, though some of our proofs depend on some
new conjectures on the Shannon capacity of graphs. In particular, we show that $\mathrm{LO}^\infty$, and more
generally $\mathrm{CE}^\infty$, does not characterize quantum models. In fact, it is satisfied by every probabilistic
model which lies in $\mathcal{Q}_1$, the set of models for which a certain semidefinite program is solvable and which (often
strictly) contains the quantum set $\mathcal{Q}$. Moreover, at least on some scenarios, there are probabilistic
models which satisfy $\mathrm{CE}^\infty$, but do not even lie in $\mathcal{Q}_1$. We also ask whether the set of probabilistic
models satisfying $\mathrm{CE}^\infty$ is convex, and whether activation of $\mathrm{CE}^\infty$ violations is possible. These are
the questions which reduce to conjectures on the Shannon capacity of graphs. Figure 12 links to
all our conjectures and our proofs of implications between them.

1.1. Structure and contents of this paper. We begin in Section 2 by introducing test 
spaces as our notion of contextuality scenario. Later (in Section 3), we will see that every Bell 
scenario is a contextuality scenario. We continue in Section 2 by defining probabilistic models 
on a contextuality scenario; e.g. for a Bell scenario, these are the no-signaling boxes. We give an 
abstract characterization of extremal probabilistic models. 

In Section 3, we consider products of contextuality scenarios corresponding to simultaneous 
measurements on spatially separated systems. We find the relevant product operation to be the 
Foulis-Randall product of test spaces. This product guarantees the no-signaling property for 
probabilistic models on the product scenario by, seemingly paradoxically, incorporating measurements
with communication. Figure 7 displays the CHSH scenario [CHSH69] as a contextuality
scenario. 

In Section 4, we study classical models on contextuality scenarios. These are precisely those 
probabilistic models that can occur in a world described by noncontextual hidden variables. We 
introduce the non-orthogonality graph of a contextuality scenario and show how its weighted 
fractional packing number detects the (non-)classicality of a probabilistic model. 

In Section 5, we consider quantum models. We show how a quantum model of a product of 
contextuality scenarios arises from quantum representations of the factors such that every operator 
associated to one factor commutes with an operator associated to another factor. 

In Section 6, we show how to formulate a hierarchy of semidefinite programs characteriz- 
ing quantum models for contextuality scenarios. This can be regarded either as a generalization 
of the original hierarchy for quantum correlations in Bell scenarios [NPA07,NPA08] or as a special 
case of the general hierarchy for noncommutative polynomial optimization [PNA10]. We relate 
the first level of this hierarchy to the weighted Lovasz number of the non-orthogonality graph. 

In Section 7, we consider the principle of Local Orthogonality (LO) introduced in [FSA+12] and 
show in which sense it arises from Consistent Exclusivity (CE) [Hen12]. We show how CE relates
to the weighted independence number and the weighted Shannon capacity of the non-orthogonality
graph. It turns out that the principle, even when applied on the level of distributed copies as $\mathrm{CE}^\infty$,
is weaker than the first level of the semidefinite hierarchy. We relate the question of equality, and
several other questions about $\mathrm{CE}^\infty$ such as convexity and activation, to open problems in graph
theory. If the non-orthogonality graph is a perfect graph, which frequently happens, then every
probabilistic model satisfying $\mathrm{CE}^1$ is classical and no interesting contextuality is possible in the given
scenario. The strong perfect graph theorem then implies that a scenario can display (quantum) 
contextuality only if it has a certain odd cycle or odd anti-cycle structure. 

In Section 8, we study the complexity of various decision problems on contextuality scenarios. 
Our "inverse sandwich conjecture" 8.3.3 is an undecidability statement whose proof would have 
significant repercussions in C*-algebra theory and quantum logic. 

In Section 9, we discuss some further examples of contextuality scenarios and the various sets 
of probabilistic models associated to them, including a prescription for translating scenarios with 
subnormalization as in [CSW10] into our framework. 

In Appendix A, we discuss the graph theory relevant for the main text. In addition to serving 
as a convenient reference, the main purpose of this is the introduction of our Conjectures A.2.1 on 
properties of graphs whose Shannon capacity coincides with their independence number. 

Finally, in Appendix B, we discuss how our approach, based on hypergraphs in which the ver- 
tices represent measurement outcomes, relates to the one of [AB11], which is based on hypergraphs 
in which vertices represent observables. We find that our approach naturally contains the other 
one. 



2. Contextuality scenarios and their probabilistic models 

2.1. Motivation: the Kochen-Specker theorem. Cabello et al. [CEGA96,Cab08] showed
that one can find 18 vectors in $\mathbb{C}^4$ labeling the vertices of Figure 2 such that the four vectors associated
to each one of the 9 edges form an orthonormal basis. Together with the observation that
there is no consistent way to label the vertices by {0, 1} such that every edge contains exactly one
vertex labeled by 1, this is a proof of the Kochen-Specker theorem for $\mathbb{C}^4$.

Figure 2. The contextuality scenario $H_{KS}$ proving the Kochen-Specker theorem
[CEGA96,Cab08].

Now what does the hypergraph of Figure 2 represent, operationally? This is what we would 
like to consider next. 

2.2. General definition. Since each edge of Figure 2 stands for a basis in $\mathbb{C}^4$, we may think 
of an edge as representing a 4-outcome measurement. Now every vertex occurs in two different such 
edges; in other words, the measurements may share outcomes. The assumption of measurement 
noncontextuality [Spe05] means that any reasonable theory should represent a shared outcome 
as a function from states to probabilities which should not depend on the particular measurement 
in which the outcome occurs. 

Abstracting from this particular example to a general definition of contextuality scenario 
means that we need to consider a mathematical structure containing a set of vertices, representing 
outcomes, and a collection of subsets of the vertices, representing measurements. Mathematically 
this is a hypergraph H with vertices V(H) and edges E(H). We therefore arrive at: 

Definition 2.2.1. A contextuality scenario is a hypergraph H such that no edge contains
another one:

$$ e_1, e_2 \in E(H),\quad e_1 \subseteq e_2 \;\Longrightarrow\; e_1 = e_2, \tag{2.1} $$

and $\bigcup_{e \in E(H)} e = V(H)$.

The reason for postulating (2.1) is related to the normalization of probability: if all outcomes
of a measurement $e_1$ are also outcomes of a measurement $e_2$, then the additional outcomes of
$e_2$ necessarily have probability 0 and can therefore be disregarded. In the literature on graph
theory, hypergraphs satisfying (2.1) are known as Sperner families [Eng97], or clutters [EF70].
The condition $\bigcup_{e \in E(H)} e = V(H)$ simply states that each outcome should occur in at least one
measurement. In the following, we will generally prefer the term vertex over outcome and edge
over measurement, while keeping in mind that the latter is the physical interpretation of the
former, respectively.
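
The two conditions of Definition 2.2.1 are easy to test mechanically. The following is a minimal sketch, assuming a scenario is stored as a vertex set together with a collection of edges; the function name `is_contextuality_scenario` and the vertex labels are illustrative choices, not notation from the text.

```python
def is_contextuality_scenario(vertices, edges):
    """Check the two conditions of Definition 2.2.1 for a finite hypergraph."""
    vertices = set(vertices)
    edges = {frozenset(e) for e in edges}
    # Condition (2.1): no edge may be contained in a different edge.
    for e1 in edges:
        for e2 in edges:
            if e1 != e2 and e1 <= e2:
                return False
    # Covering condition: the union of all edges must be the whole vertex set.
    covered = set().union(*edges)
    return covered == vertices


V = {"v1", "v2", "v3"}
triangle = [{"v1", "v2"}, {"v2", "v3"}, {"v1", "v3"}]
nested = [{"v1", "v2"}, {"v1", "v2", "v3"}]   # violates (2.1)
uncovered = [{"v1", "v2"}]                    # v3 lies in no edge

print(is_contextuality_scenario(V, triangle))   # True
print(is_contextuality_scenario(V, nested))     # False
print(is_contextuality_scenario(V, uncovered))  # False
```

Storing edges as frozensets makes the containment test `e1 <= e2` a direct transcription of (2.1).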

Figure 3. The triangle scenario $\Delta$. 

In a typical scenario, the hypergraph H is finite, meaning that V(H) is a finite set, and this is 
the only case that we want to consider. 

This kind of definition has been considered before in the literature on contextuality and 
the Kochen-Specker theorem, e.g. in [Tka00, PMMF10], and coincides with the notion of test 
space [Wil09] which had been introduced in [FR72, RF73] as (generalized) sample space. 
In particular, the Greechie diagrams [Gre71,ST96] of quantum logic can all be regarded as 
contextuality scenarios. 

On the other hand, Definition 2.2.1 differs from the formalisms proposed in [AB11] and [CF12, 
FC12]. These works also provide a formalization of contextuality phenomena in terms of hyper- 
graphs, but the vertices of the hypergraph represent observables rather than outcomes, while the 
edges stand for (maximal) jointly measurable sets of observables. See Appendix B for a more 
detailed discussion. 

2.3. Probabilistic models. In a concrete physical situation, every outcome carries a proba- 
bility, and the sum of these outcome probabilities over all outcomes in a measurement is 1. This 
gives: 

Definition 2.3.1. Let H be a contextuality scenario. A probabilistic model on H is an
assignment $p : V(H) \to [0, 1]$ of a probability $p(v)$ to each vertex $v \in V(H)$ such that

$$ \sum_{v \in e} p(v) = 1 \quad \forall e \in E(H). \tag{2.2} $$

It is important to keep in mind that each p(v) is actually a conditional probability: it stands 
for the probability of getting the outcome v given that a measurement $e \ni v$ is conducted. 

The set of all probabilistic models on H is a convex subset of $\mathbb{R}^{V(H)}$, possibly empty, which
we denote by $\mathcal{G}(H)$. This notation is supposed to suggest the reading "general probabilistic" in 
the sense of general probabilistic theories [Bar07]. In the terminology of test spaces [Wil08], 
G(H) is the set of states over H; unfortunately, the term "probabilistic model" has a different 
meaning there. 

We now turn to some basic examples other than Figure 2. Those mainly interested in Bell 
scenarios will find them in Section 3. 

Example 2.3.2. Figure 3 displays the triangle scenario $\Delta$. Its only probabilistic model is
$p(v_1) = p(v_2) = p(v_3) = \tfrac{1}{2}$, since this is the only solution to the system of normalization equations

$$ p(v_1) + p(v_2) = 1, \quad p(v_2) + p(v_3) = 1, \quad p(v_1) + p(v_3) = 1. $$

See [LSW11] for more on this scenario and its unique probabilistic model. 
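
The uniqueness claim of Example 2.3.2 can be checked by solving the normalization equations (2.2) exactly. The sketch below is only an illustration: `solve_unique` is a hypothetical helper that performs Gauss-Jordan elimination over the rationals and ignores the constraint $0 \le p(v) \le 1$, which happens to be satisfied by the solution it finds for the triangle.

```python
from fractions import Fraction


def solve_unique(A, b):
    """Solve A x = b exactly; return None unless a solution exists and is unique."""
    n_rows, n_cols = len(A), len(A[0])
    # Augmented matrix with exact rational entries.
    M = [[Fraction(x) for x in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    pivot_row = 0
    for col in range(n_cols):
        pivot = next((r for r in range(pivot_row, n_rows) if M[r][col] != 0), None)
        if pivot is None:
            return None  # a free variable means the solution is not unique
        M[pivot_row], M[pivot] = M[pivot], M[pivot_row]
        pv = M[pivot_row][col]
        M[pivot_row] = [x / pv for x in M[pivot_row]]
        for r in range(n_rows):
            if r != pivot_row and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[pivot_row])]
        pivot_row += 1
    # Any leftover row reads 0 = rhs and must be consistent.
    if any(row[-1] != 0 for row in M[pivot_row:]):
        return None
    return [M[i][-1] for i in range(n_cols)]


vertices = ["v1", "v2", "v3"]
edges = [{"v1", "v2"}, {"v2", "v3"}, {"v1", "v3"}]
# One normalization equation per edge: the incidence row times p equals 1.
A = [[1 if v in e else 0 for v in vertices] for e in edges]
p = solve_unique(A, [1] * len(edges))
print(p)  # [Fraction(1, 2), Fraction(1, 2), Fraction(1, 2)]
```

Exact rational arithmetic is used so that uniqueness is decided without numerical tolerances.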

Figure 4. Example of a scenario $H_0$ without any probabilistic model: $\mathcal{G}(H_0) = \emptyset$.

Figure 5. The contextuality scenario $B_{1,k,m}$, a "Bell scenario" with only one party.



Contextuality scenarios having a unique probabilistic model, like $\Delta$ does, will be of particular 
importance in Theorem 2.4.3. 

Example 2.3.3. Figure 4 displays a contextuality scenario $H_0$ with $\mathcal{G}(H_0) = \emptyset$. Indeed, each
of the outer triangles corresponds to a copy of the scenario $\Delta$ of Figure 3 and admits a unique
probabilistic model where each vertex is assigned a probability 1/2. This is incompatible with 
the three-outcome measurement depicted in orange which imposes that the probabilities associated 
with the three corresponding vertices should sum to 1. 



Example 2.3.4. Figure 5 displays the contextuality scenario defined by k measurements with m 
outcomes each, such that no two measurements share any outcome. Such scenarios are particularly 
relevant for describing "box" experiments where an observer can press one of k buttons and record 
the corresponding measurement outcome. 
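
As an illustration of Example 2.3.4, the scenario with k disjoint m-outcome measurements can be generated programmatically. This is a sketch under our own labeling convention: a vertex is a pair `(x, a)` meaning outcome `a` of measurement `x`, and the function names are hypothetical. Since the edges are disjoint, a {0,1}-valued probabilistic model amounts to picking one outcome per measurement, which gives m^k such models.

```python
from itertools import product


def bell_single_party(k, m):
    """k pairwise disjoint edges with m vertices each; vertex (x, a) = outcome a of measurement x."""
    edges = [frozenset((x, a) for a in range(m)) for x in range(k)]
    vertices = set().union(*edges)
    return vertices, edges


def deterministic_models(k, m):
    """All {0,1}-valued probabilistic models: one chosen outcome per measurement."""
    vertices, _ = bell_single_party(k, m)
    models = []
    for choice in product(range(m), repeat=k):
        p = {v: 0 for v in vertices}
        for x, a in enumerate(choice):
            p[(x, a)] = 1
        models.append(p)
    return models


vertices, edges = bell_single_party(2, 3)
models = deterministic_models(2, 3)
print(len(vertices), len(edges), len(models))  # 6 2 9
```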

Further examples will be discussed in Section 9. 

For fixed H, the set $\mathcal{G}(H) \subseteq \mathbb{R}^{V(H)}$ is defined in terms of finitely many linear inequalities with
rational coefficients. Therefore, it is a convex polytope with rational vertices. A natural question
now is, which polytopes with rational vertices can arise in this way? This has been answered by
Shultz:

Theorem 2.3.5 ([Shu74]). Let $P \subseteq \mathbb{R}^d$ be a polytope with vertices in $\mathbb{Q}^d$. Then there exists a
contextuality scenario $H_P$ such that $\mathcal{G}(H_P)$ is affinely isomorphic to $P$.

Surprisingly, the combinatorial structure of some polytopes is such that they cannot be repre- 
sented with rational coordinates only [Zie95, Ex. 6.21]. 

2.4. Characterizing extremal probabilistic models. Since G(H) is a convex polytope, a 
natural question is: what are its extreme points? For example, for the CHSH scenario $B_{2,2,2}$ that
we will discuss in Section 3, $\mathcal{G}(B_{2,2,2})$ is the no-signaling polytope, and hence its extreme points
are the 16 deterministic boxes together with the 8 variants of the PR-box. 

In this subsection, we would like to give an abstract characterization of these extremal models 
which applies to every contextuality scenario. 

Definition 2.4.1. Let H be a contextuality scenario. We say that a non-empty set $W \subseteq V(H)$
induces a subscenario if $e_1 \cap W \subseteq e_2 \cap W$ implies that $e_1 = e_2$ for all $e_1, e_2 \in E(H)$. In this
case, $H_W$ with

$$ V(H_W) = W, \qquad E(H_W) = \{\, e \cap W : e \in E(H) \,\} $$

is the subscenario induced by W.

The assumption on W guarantees that $H_W$ is also a contextuality scenario. In particular, it
implies that $e \cap W \neq \emptyset$ for all $e \in E(H)$, meaning that W is a transversal (or hitting set) of
the hypergraph H. The subscenario $H_W$ is a subclutter of H.

In words: $H_W$ is constructed by dropping all vertices which do not belong to W and restricting
all edges accordingly. In doing this, the subset $W \subseteq V(H)$ is assumed to guarantee that no two
different edges have equal restrictions or one restriction containing the other.

Intuitively, $H_W$ is the same scenario as H, except that all outcomes not in W have been
forbidden. In particular, every probabilistic model $p_W$ on $H_W$ extends to H by setting

$$ p(v) = \begin{cases} p_W(v) & \text{if } v \in W, \\ 0 & \text{otherwise.} \end{cases} $$

In this case, we say that p is the extension of $p_W$ to H.

We have implicitly used induced subscenarios in [FSA+12] when considering the graphs on
"possible events".

Lemma 2.4.2. If $H_W$ is an induced subscenario of H and $H_{W,W'}$ is an induced subscenario of
$H_W$, then $H_{W,W'}$ is also an induced subscenario of H.

Proof. Clear. □

Our main result in this section is this:

Theorem 2.4.3. $p \in \mathcal{G}(H)$ is extremal if and only if it is the extension of $p_W \in \mathcal{G}(H_W)$ from
some induced subscenario $H_W$ which has $p_W$ as its unique probabilistic model.

Proof. If H has a unique probabilistic model, i.e. if $\mathcal{G}(H) = \{p\}$, then there is nothing to
prove.
Otherwise, the extreme points of $\mathcal{G}(H)$ are precisely the extreme points of the facets of $\mathcal{G}(H)$.
Since $\mathcal{G}(H)$ is defined by

$$ p(v) \geq 0 \quad \forall v \in V(H), \qquad \sum_{v \in e} p(v) = 1 \quad \forall e \in E(H), $$

for every facet of $\mathcal{G}(H)$ there exists some $v \in V(H)$ such that the facet contains exactly those $p \in \mathcal{G}(H)$
with $p(v) = 0$. We fix such a v and set

$$ W = \{\, w \in V(H) \mid \exists\, p \in \mathcal{G}(H) \text{ s.t. } p(v) = 0 \,\wedge\, p(w) \neq 0 \,\}. $$

In particular, $v \notin W$, and W induces a subscenario $H_W$. By construction, $\mathcal{G}(H_W)$ is the facet
of $\mathcal{G}(H)$ defined by $p(v) = 0$.

The assertion then follows by repeatedly applying this process to the induced subscenarios 
constructed in this way; the lemma guarantees that one obtains an induced subscenario of the 
original H at each step. This recursion necessarily ends with a scenario which admits a unique 
probabilistic model, since the dimension of $\mathcal{G}(H)$ decreases by 1 in each step. □

As the proof shows, a similar statement also holds for all faces of $\mathcal{G}(H)$: they all are of the form
$\mathcal{G}(H_W)$ for some induced subscenario $H_W$.

We conclude that an extreme point $p \in \mathcal{G}(H)$ is uniquely determined by the set of vertices
$W = \{ v \in V(H) \mid p(v) \neq 0 \}$, which induces a subscenario $H_W$ with a unique probabilistic model
corresponding to forgetting the zeros of p.

The deterministic models of Definition 4.1.1 are a special case of this. Clearly, every deterministic
model is an extreme point of $\mathcal{G}(H)$. In terms of Theorem 2.4.3, p is deterministic if and only if
each measurement in the associated $H_W$ has only one outcome, i.e. if every vertex in $H_W$ is its own
singleton edge. Those extreme points which are not deterministic are the maximally contextual
models in the scenario H.
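
To make the notions of induced subscenario and extension concrete, here is a small sketch. The helper names (`induces_subscenario`, `subscenario`, `extend`) and the example scenario with two disjoint two-outcome measurements are our own illustrative assumptions. The example shows a deterministic model arising as the extension of the unique model on a subscenario whose edges are singletons.

```python
def induces_subscenario(edges, W):
    """Definition 2.4.1: e1 ∩ W ⊆ e2 ∩ W must force e1 = e2 (W non-empty)."""
    W = frozenset(W)
    edges = {frozenset(e) for e in edges}
    if not W:
        return False
    return all(e1 == e2 or not (e1 & W <= e2 & W) for e1 in edges for e2 in edges)


def subscenario(edges, W):
    """Edges of H_W: every edge of H restricted to W."""
    W = frozenset(W)
    return {frozenset(e) & W for e in edges}


def extend(p_W, vertices):
    """Extension of a model on H_W to H: zero outside W."""
    return {v: p_W.get(v, 0) for v in vertices}


# Two disjoint measurements x = 0, 1 with outcomes a = 0, 1; vertex = (x, a).
edges = [{(0, 0), (0, 1)}, {(1, 0), (1, 1)}]
vertices = {(0, 0), (0, 1), (1, 0), (1, 1)}

W = {(0, 0), (1, 1)}
print(induces_subscenario(edges, W))         # True
print(induces_subscenario(edges, {(0, 0)}))  # False: the second edge misses W

# H_W has singleton edges only, so its unique model assigns 1 to each vertex of W.
p = extend({v: 1 for v in W}, vertices)
print(all(sum(p[v] for v in e) == 1 for e in edges))  # True: p is a deterministic model on H
```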

3. Products of contextuality scenarios and the no-signaling property 

3.1. Products. Imagine two spatially separated or spacelike separated parties Alice and Bob. 
Alice is assumed to operate in a contextuality scenario $H_A$, while Bob is taken to operate in a
contextuality scenario $H_B$. Now the two parties can apply simultaneous measurements on their
respective systems and will then obtain simultaneous outcomes. In general, the two systems can be 
correlated, which may lead to correlations between the outcomes. The question now is, what is the 
contextuality scenario describing this situation? How do two contextuality scenarios combine into 
a joint one? This question was first answered by Foulis and Randall [FR81], who noticed that the 
answer is nontrivial. When combining two scenarios into one, we speak of a product. 

Clearly, the set of outcomes of a product scenario should be the cartesian product of the sets 
of outcomes, so that a joint outcome simply is the same thing as an outcome of $H_A$ together with
an outcome of $H_B$. Also, every edge on $H_A$ should combine with any edge on $H_B$ into a joint edge.
So, naively, one would define the product scenario like this: 

Definition 3.1.1. Let $H_A$ and $H_B$ be contextuality scenarios. The direct product is the
scenario $H_A \times H_B$ with

$$ V(H_A \times H_B) = V(H_A) \times V(H_B), \qquad E(H_A \times H_B) = E(H_A) \times E(H_B). $$

Now, in any actually observed model, Bob's outcome probabilities should not depend on Alice's 
choice of measurement and vice versa: 

[Figure 6, panels: (a) The contextuality scenario T. (b) The direct product T × T equipped with a
(deterministic) probabilistic model.]

Figure 6. A contextuality scenario T and a probabilistic model on T × T not
satisfying the no-signaling condition.



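Definition 3.1.2. A probabilistic model $p \in \mathcal{G}(H_A \times H_B)$ is no-signaling if

(a) For every $w \in V(H_B)$,

$$ \sum_{v \in e} p(v, w) = \sum_{v \in e'} p(v, w) $$

for all $e, e' \in E(H_A)$;
(b) For every $v \in V(H_A)$,

$$ \sum_{w \in e} p(v, w) = \sum_{w \in e'} p(v, w) $$

for all $e, e' \in E(H_B)$.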



This coincides with [BL09, Defn. 8] and [BFRW05, Defn. 3.2], although the terminology is 
different. 

Now the obvious question is, is every $p \in \mathcal{G}(H_A \times H_B)$ no-signaling? Unfortunately, this is not
the case; Figure 6 provides the arguably simplest example. It displays a deterministic model where
Alice (vertical) knows with certainty which measurement was performed by Bob (horizontal). It is
easy to come up with other examples for virtually any non-trivial scenarios $H_A$ and $H_B$.
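
A toy example of such a signaling model can be checked mechanically. The sketch below does not reproduce Figure 6; it uses a hypothetical pair of scenarios chosen by us (Alice has a single two-outcome measurement, Bob has two disjoint two-outcome measurements), and all names are illustrative. In the model `signaling`, Alice's outcome reveals which measurement Bob performed.

```python
from fractions import Fraction
from itertools import product


def direct_product_edges(edges_A, edges_B):
    """Edges of H_A x H_B: one joint edge e_A x e_B per pair of edges."""
    return [frozenset(product(eA, eB)) for eA in edges_A for eB in edges_B]


def is_model(p, edges):
    """Normalization (2.2) on every edge."""
    return all(sum(p[v] for v in e) == 1 for e in edges)


def is_no_signaling(p, edges_A, edges_B):
    """Conditions (a) and (b) of Definition 3.1.2."""
    V_A, V_B = set().union(*edges_A), set().union(*edges_B)
    for w in V_B:  # (a): independent of Alice's choice of edge
        if len({sum(p[(v, w)] for v in e) for e in edges_A}) > 1:
            return False
    for v in V_A:  # (b): independent of Bob's choice of edge
        if len({sum(p[(v, w)] for w in e) for e in edges_B}) > 1:
            return False
    return True


edges_A = [{"a0", "a1"}]
edges_B = [{"b00", "b01"}, {"b10", "b11"}]
joint = direct_product_edges(edges_A, edges_B)

# Alice sees a0 exactly when Bob measures his first edge, a1 when he measures his second.
support = {("a0", "b00"), ("a1", "b10")}
signaling = {(a, b): int((a, b) in support)
             for a in ("a0", "a1") for b in ("b00", "b01", "b10", "b11")}
uniform = {v: Fraction(1, 4) for v in signaling}

print(is_model(signaling, joint), is_no_signaling(signaling, edges_A, edges_B))  # True False
print(is_model(uniform, joint), is_no_signaling(uniform, edges_A, edges_B))      # True True
```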

While one solution for this problem is to simply restrict to no-signaling models by fiat [BL09], a 
conceptually much more appealing solution is to use the "right" product of contextuality scenarios: 

Definition 3.1.3 ([FR81]). The Foulis-Randall product (FR-product) is the scenario
$H_A \otimes H_B$ with

$$ V(H_A \otimes H_B) = V(H_A) \times V(H_B), \qquad E(H_A \otimes H_B) = E_{A \to B} \cup E_{A \leftarrow B}, $$

where

$$ E_{A \to B} = \Big\{ \bigcup_{a \in e_A} \{a\} \times f(a) \;:\; e_A \in E_A,\ f : e_A \to E_B \Big\}, \qquad
   E_{A \leftarrow B} = \Big\{ \bigcup_{b \in e_B} f(b) \times \{b\} \;:\; e_B \in E_B,\ f : e_B \to E_A \Big\}. \tag{3.1} $$

Intuitively, an element of $E_{A \to B}$ is the following: first, an edge $e_A \in E(H_A)$ representing
a measurement conducted by Alice; second, a function $f : e_A \to E(H_B)$ which determines the
subsequent measurement of Bob as a function of Alice's outcome. This function f maps each
vertex $a \in e_A$ to an edge $f(a) \in E_B$. Similarly for $E_{A \leftarrow B}$, where we think of Bob measuring first
and communicating his outcome to Alice, who then chooses her measurement as a function of Bob's
outcome. Both possibilities are feasible ways to operate on the joint system and therefore should
be considered as measurements conductible on the joint system. In this way, an edge in $H_A \otimes H_B$
is an element of $E_{A \to B}$, $E_{A \leftarrow B}$, or of both sets. For example, Figure 7(f) displays the FR-product
of 7(a) with 7(b), which is another copy of 7(a). $E_{A \to B}$ contains the edges of Figure 7(c) and 7(d),
while $E_{A \leftarrow B}$ consists of 7(c) and 7(e).

Since $H_A \otimes H_B$ contains the same vertices as $H_A \times H_B$ but more edges, we have an inclusion
$\mathcal{G}(H_A \otimes H_B) \subseteq \mathcal{G}(H_A \times H_B)$. The following observation is due to Barnum, Fuchs, Renes and Wilce:

Proposition 3.1.4 ([BFRW05, Cor. 3.5]). $\mathcal{G}(H_A \otimes H_B) \subseteq \mathcal{G}(H_A \times H_B)$ is exactly the set of
no-signaling models.
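
The Foulis-Randall product can be enumerated directly from (3.1). The following sketch is an illustration under our own encoding (vertices of the single-party scenario are pairs `(x, a)`, and `fr_product` is a hypothetical helper name). For two copies of the scenario with two disjoint two-outcome measurements it produces the CHSH scenario, on which the Popescu-Rohrlich box is a probabilistic model, whereas a deterministic signaling model on the direct product is not.

```python
from fractions import Fraction
from itertools import product


def fr_product(edges_A, edges_B):
    """Edge set E_{A->B} ∪ E_{A<-B} of the Foulis-Randall product, following (3.1)."""
    edges = set()
    for eA in edges_A:  # Alice measures first
        outcomes = list(eA)
        for f in product(edges_B, repeat=len(outcomes)):  # f[i]: Bob's edge after outcomes[i]
            edges.add(frozenset((a, b) for a, fa in zip(outcomes, f) for b in fa))
    for eB in edges_B:  # Bob measures first
        outcomes = list(eB)
        for f in product(edges_A, repeat=len(outcomes)):  # f[i]: Alice's edge after outcomes[i]
            edges.add(frozenset((a, b) for b, fb in zip(outcomes, f) for a in fb))
    return edges


def is_model(p, edges):
    return all(sum(p[v] for v in e) == 1 for e in edges)


# Single-party scenario: measurements x = 0, 1 with outcomes a = 0, 1; vertex = (x, a).
B122 = [frozenset({(0, 0), (0, 1)}), frozenset({(1, 0), (1, 1)})]
chsh = fr_product(B122, B122)
vertices = set().union(*chsh)
direct = {frozenset(product(eA, eB)) for eA in B122 for eB in B122}

# PR box: weight 1/2 on the outcome pairs with a XOR b = x AND y.
pr_box = {((x, a), (y, b)): Fraction(1, 2) if (a ^ b) == (x & y) else Fraction(0)
          for (x, a), (y, b) in vertices}
# Deterministic signaling model: Alice's outcome equals Bob's measurement label.
signaling = {((x, a), (y, b)): int(a == y and b == 0) for (x, a), (y, b) in vertices}

print(len(vertices), len(chsh))                               # 16 12
print(is_model(pr_box, chsh))                                 # True
print(is_model(signaling, direct), is_model(signaling, chsh))  # True False
```

The 12 edges consist of the 4 simultaneous measurements, which lie in both $E_{A \to B}$ and $E_{A \leftarrow B}$, plus 4 adaptive measurements in each direction.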

It is in this sense that $H_A \otimes H_B$, in contrast to $H_A \times H_B$, automatically incorporates the 
no-signaling requirement of special relativity. This is the reason why we regard it as the "right" 
product of contextuality scenarios. 

Both the inclusion $\mathcal{G}(H_A \otimes H_B) \subseteq \mathcal{G}(H_A \times H_B)$ and Proposition 3.1.4 can intuitively be
understood in terms of the duality between states and effects [D'A10]: restricting the models
considered to the no-signaling ones makes more measurements well-defined and in particular allows
measurements in which the parties use signaling; on the other hand, allowing measurements in
which the parties use signaling is possible only if the system itself, on which the measurements
are conducted, does not have internal signaling. Compare Wilce [Wil08], who prefers the term
influence-free over no-signaling.

One can also do all this for the case of unidirectional no-signaling: defining a product of $H_A$
and $H_B$ by only using the $E_{A \to B}$ of (3.1) gives probabilistic models which are no-signaling from
Bob to Alice. See [BFRW05] for more details. The resulting product contextuality scenario may
be interpreted as describing a temporal succession of operating on $H_B$ after having operated on
$H_A$.

Given two contextuality scenarios $H_A$ and $H_B$ together with probabilistic models

$$ p_A \in \mathcal{G}(H_A), \qquad p_B \in \mathcal{G}(H_B), $$

there should exist a probabilistic model $p_A \otimes p_B$ on $H_A \otimes H_B$ having the interpretation of placing
physical systems behaving as $p_A$ and $p_B$ "side by side" so that measurements can be conducted on
both in parallel, revealing no correlations between the two systems, but independent statistics. To
this end, one should obviously define

$$ p_A \otimes p_B : V(H_A) \times V(H_B) \to [0,1], \qquad (v_A, v_B) \mapsto p_A(v_A)\, p_B(v_B). $$

Proposition 3.1.5. This $p_A \otimes p_B$ is a probabilistic model on $H_A \otimes H_B$.

Proof. We need to prove that $\sum_{v \in e} (p_A \otimes p_B)(v) = 1$ for each edge $e \in E(H_A \otimes H_B)$. Without
loss of generality, we can assume $e \in E_{A \to B}$, i.e. $e = \bigcup_{a \in e_A} \{a\} \times f(a)$ for some $e_A \in E_A$ and some
$f : e_A \to E_B$, which maps each vertex in $e_A$ to an edge in $H_B$. Therefore,

$$ \sum_{v \in e} (p_A \otimes p_B)(v) = \sum_{a \in e_A} \sum_{b \in f(a)} p_A(a)\, p_B(b) = \sum_{a \in e_A} p_A(a) \sum_{b \in f(a)} p_B(b) = \sum_{a \in e_A} p_A(a) \cdot 1 = 1, $$

since $p_B$ and $p_A$ are probabilistic models on $H_B$ and $H_A$, respectively. □
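
The chain of equalities in this proof can be replayed numerically. The sketch below is an illustration with example data of our own choosing: the triangle scenario with its model 1/2 for Alice, and an arbitrary rational model on two disjoint two-outcome measurements for Bob. The hypothetical helper `product_model_sums` walks over every pair $(e_A, f)$ defining an edge of $E_{A \to B}$ and evaluates the factorized sum from the proof.

```python
from fractions import Fraction
from itertools import product


def product_model_sums(edges_A, p_A, edges_B, p_B):
    """Total weight of p_A ⊗ p_B on every edge of E_{A->B}, one value per pair (e_A, f)."""
    sums = []
    for eA in edges_A:
        outcomes = list(eA)
        for f in product(edges_B, repeat=len(outcomes)):  # f[i] is Bob's edge for outcomes[i]
            # Factorized form: sum_a p_A(a) * (sum over b in f(a) of p_B(b)).
            total = sum(p_A[a] * sum(p_B[b] for b in fa) for a, fa in zip(outcomes, f))
            sums.append(total)
    return sums


half = Fraction(1, 2)
edges_A = [{"v1", "v2"}, {"v2", "v3"}, {"v1", "v3"}]
p_A = {"v1": half, "v2": half, "v3": half}

edges_B = [{"b00", "b01"}, {"b10", "b11"}]
p_B = {"b00": Fraction(1, 3), "b01": Fraction(2, 3),
       "b10": Fraction(1, 4), "b11": Fraction(3, 4)}

sums = product_model_sums(edges_A, p_A, edges_B, p_B)
print(len(sums), all(s == 1 for s in sums))  # 12 True
```

The case of $E_{A \leftarrow B}$ follows by exchanging the roles of the two arguments.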

[Figure 7, panels: (c) Simultaneous measurements. (d) Bob's measurement choice depends on Alice's
outcome. (e) Alice's measurement choice depends on Bob's outcome. (f) Foulis-Randall product: the CHSH
scenario $B_{2,2,2} = B_{1,2,2} \otimes B_{1,2,2}$. (g) Alternative drawing of 7(f) after rearranging the vertices.]

Figure 7. Construction of the CHSH scenario $B_{2,2,2}$ as a Foulis-Randall product
$B_{2,2,2} = B_{1,2,2} \otimes B_{1,2,2}$.

We write $\mathcal{G}(H_A) \otimes \mathcal{G}(H_B)$ for the set of all probabilistic models of the form $p_A \otimes p_B$. We have
just shown that $\mathcal{G}(H_A) \otimes \mathcal{G}(H_B) \subseteq \mathcal{G}(H_A \otimes H_B)$.

Remark 3.1.6. Often $\mathcal{G}(H_A \otimes H_B)$ is strictly bigger than the convex hull of $\mathcal{G}(H_A) \otimes \mathcal{G}(H_B)$.
For example for the Bell scenario $B_{2,2,2} = B_{1,2,2} \otimes B_{1,2,2}$ discussed below, the Popescu-Rohrlich
box [Tsi93, eq. (1.11)], [PR94] is an element of $\mathcal{G}(B_{1,2,2} \otimes B_{1,2,2})$, but does not lie in the convex
hull of $\mathcal{G}(B_{1,2,2}) \otimes \mathcal{G}(B_{1,2,2})$.

3.2. Products of more than two scenarios. It is not difficult to check that the Foulis-Randall
product "$\otimes$" is a commutative binary operation on contextuality scenarios. But now what
about having more than two parties which operate in their respective scenarios simultaneously? 

Given three scenarios $H_A$, $H_B$, $H_C$, we can first form the product $H_A \otimes H_B$ and then the
product of this with $H_C$, which gives $(H_A \otimes H_B) \otimes H_C$; unraveling the definitions reveals that an
edge in this scenario pertains to one of the four sets

\[ E_{(A\to B)\to C},\quad E_{(A\leftarrow B)\to C},\quad E_{(A\to B)\leftarrow C},\quad E_{(A\leftarrow B)\leftarrow C} \tag{3.2} \]

where $E_{(A\to B)\to C}$ is defined to be the collection of all sets of the form

\[ \{\, (a,b,c) \in V(H_A)\times V(H_B)\times V(H_C) \mid a\in e_A,\ b\in f(a),\ c\in g(a,b) \,\} \tag{3.3} \]

where $e_A \in E(H_A)$ is fixed and $f : V(H_A) \to E(H_B)$ and $g : V(H_A)\times V(H_B) \to E(H_C)$ are any
functions, and similarly for the other three. In this way, every one of the four sets (3.2) contains all
those measurements associated to a certain ordering of the three parties; these four orderings are

\[ A\to B\to C,\quad B\to A\to C,\quad C\to A\to B,\quad C\to B\to A. \tag{3.4} \]

On the other hand, the bracketing $H_A \otimes (H_B \otimes H_C)$ is based in a similar way on four sets of edges

\[ E_{A\to(B\to C)},\quad E_{A\leftarrow(B\to C)},\quad E_{A\to(B\leftarrow C)},\quad E_{A\leftarrow(B\leftarrow C)} \tag{3.5} \]

which represent the time orderings

\[ A\to B\to C,\quad B\to C\to A,\quad A\to C\to B,\quad C\to B\to A. \tag{3.6} \]

Now these four time orderings are different from (3.4); therefore, in general, $H_A \otimes (H_B \otimes H_C)$
contains different edges than $(H_A \otimes H_B) \otimes H_C$: the Foulis-Randall product is not associative!
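The failure of associativity can be checked by brute force on the smallest interesting case. The following is a minimal sketch, assuming three copies of the single-party scenario $B_{1,2,2}$ with vertices written as pairs (outcome, setting); the helper name `fr_product` and its `join` argument are illustrative choices, not notation from the text.

```python
from itertools import product

def fr_product(edges_a, edges_b, join):
    """Edges of the Foulis-Randall product of two scenarios.

    `join(u, v)` merges an A-vertex and a B-vertex into one product vertex,
    so that differently bracketed triple products can be compared directly.
    """
    edges = set()
    for e_a in edges_a:                  # A first; B's edge depends on A's outcome
        outcomes = sorted(e_a)
        for f in product(edges_b, repeat=len(outcomes)):
            edges.add(frozenset(join(a, b)
                                for a, e_b in zip(outcomes, f) for b in e_b))
    for e_b in edges_b:                  # B first; A's edge depends on B's outcome
        outcomes = sorted(e_b)
        for g in product(edges_a, repeat=len(outcomes)):
            edges.add(frozenset(join(a, b)
                                for b, e_a in zip(outcomes, g) for a in e_a))
    return edges

# B_{1,2,2}: one edge {(0, x), (1, x)} per setting x.
B = [frozenset({(0, x), (1, x)}) for x in (0, 1)]

ab = fr_product(B, B, lambda u, v: (u, v))
bc = fr_product(B, B, lambda v, w: (v, w))

# Both bracketings, with vertices flattened to triples (u, v, w).
left = fr_product(ab, B, lambda uv, w: (uv[0], uv[1], w))
right = fr_product(B, bc, lambda u, vw: (u, vw[0], vw[1]))

print(len(ab), left == right)
```

In this sketch the two bracketings give different edge sets: for instance, an edge in which B measures first, C's setting depends on B's outcome, and A's setting depends on C's outcome belongs to `right` but not to `left`.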

Nevertheless, we will frequently want to work with a Foulis-Randall product of more than
two factors. One way around the problem of non-associativity is to define an n-fold product
$H_{A_1} \otimes \ldots \otimes H_{A_n}$ (no brackets) by taking all possible time orderings of the parties $A_1, \ldots, A_n$ into
account in the sense that an edge in $H_{A_1} \otimes \ldots \otimes H_{A_n}$ is defined by the following data:

(a) a permutation $\sigma$ of the parties such that party $\sigma(t)$ measures at timestep $t \in \{1, \ldots, n\}$;

(b) for every timestep $t$, a function $f_t : V(H_{\sigma(1)}) \times \ldots \times V(H_{\sigma(t-1)}) \to E(H_{\sigma(t)})$ (possibly
constant) which specifies the measurement of party $\sigma(t)$ as a function of all the previous
outcomes of parties $\sigma(1), \ldots, \sigma(t-1)$. (For party $\sigma(1)$, this is a function without arguments,
i.e. a constant.)

The edge associated to this data is then given by 

\[ \{\, (a_1, \ldots, a_n) \mid a_i \in f_{\sigma^{-1}(i)}\big(a_{\sigma(1)}, \ldots, a_{\sigma(\sigma^{-1}(i)-1)}\big) \text{ for } 1 \le i \le n \,\} \subseteq V(H_1) \times \ldots \times V(H_n). \]

For example, in the case of three parties A, B, C, the scenario $H_A \otimes H_B \otimes H_C$ turns out to comprise
precisely (3.2) and (3.5). In this way, we obtain an n-ary Foulis-Randall product for any $n \in \mathbb{N}$.






However, for n scenarios $H_{A_1}, \ldots, H_{A_n}$, there are many other ways to form an n-fold product:
any consistent way of introducing brackets in the expression

\[ H_{A_1} \otimes \ldots \otimes H_{A_n} \]

defines an n-fold Foulis-Randall product. For example,

\[ (H_{A_1} \otimes H_{A_2} \otimes H_{A_3}) \otimes H_{A_4} \otimes H_{A_5} \]

represents formation of the ternary product $H_{A_1} \otimes H_{A_2} \otimes H_{A_3}$, followed by taking the ternary
product of the resulting scenario with $H_{A_4}$ and $H_{A_5}$.

Now which one of these products is the "right" one which captures the physical intuition of 
conducting measurements on independent systems? It turns out that all of these products do the 
job; while we officially stick to the version without any brackets, all our proofs will also be valid for 
any non-trivial bracketing; in fact, we suspect that these different products are related via notions 
like perspective [Wil05]. In any case, the n-ary product without brackets is "maximal" in the sense
that it contains all the edges of all other products. In physical terms, it allows application of the 
parties' measurements in any temporal order. 

3.3. Bell scenarios. We now explain how Bell scenarios are examples of contextuality scenarios.
The Bell scenario $B_{n,k,m}$ consists of n parties having access to k measurements each, each
of which has m possible outcomes. At the single-party level, the outcomes form a contextuality
scenario $B_{1,k,m}$ as depicted in Figure 5. As contextuality scenarios, we define

\[ B_{n,k,m} \stackrel{\text{def}}{=} B_{1,k,m} \otimes \ldots \otimes B_{1,k,m}, \tag{3.7} \]

and we will see in the following how this leads to the usual concepts studied as "nonlocality".

It is straightforward to generalize this definition and all our upcoming results to scenarios where 
the parties have access to different numbers of measurements and outcomes per measurement, but 
we will not consider this explicitly. 

Example 3.3.1 (The CHSH scenario). Figure 7 illustrates how $B_{2,2,2}$ arises as $B_{1,2,2} \otimes B_{1,2,2}$.
A vertex ab|xy represents the event where Alice (resp. Bob) chooses measurement x (resp. y) and
obtains output a (resp. b). In this scenario, the edges are as follows:

• For simultaneous measurements, the f of (3.1) are constant, and the measurements are as
in Figure 7(c):

{00|00, 01|00, 10|00, 11|00},
{00|01, 01|01, 10|01, 11|01},
{00|10, 01|10, 10|10, 11|10},
{00|11, 01|11, 10|11, 11|11}.

• If Alice measures first and Bob's choice of setting depends on her outcome, then the
events are of the form ab|xf(a), where f is not a constant. Thus we have two possibilities:
f(a) = a or f(a) = 1 - a. In the first case we obtain the edges

{00|00, 01|00, 10|01, 11|01},
{00|10, 01|10, 10|11, 11|11},

and in the second case,

{00|01, 01|01, 10|00, 11|00},
{00|11, 01|11, 10|10, 11|10}.

These are the red edges in Figures 7(d) and 7(f), 7(g).
• Similarly, Bob measuring first with Alice's subsequent choice of setting depending on his
outcome gives rise to the edges

{00|00, 01|10, 10|00, 11|10},

{00|01, 01|11, 10|01, 11|11},

{00|10, 01|00, 10|10, 11|00},

{00|11, 01|01, 10|11, 11|01}.

These are the green edges in Figures 7(e) and 7(f), 7(g).
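The three families of edges listed above can be generated mechanically from the wirings. The sketch below is illustrative only: the function names `chsh_edges` and `pr_box` are hypothetical, a wiring f is stored as the pair (f(0), f(1)), and the Popescu-Rohrlich box of Remark 3.1.6 is taken in its common form p(ab|xy) = 1/2 if a XOR b = x AND y, and 0 otherwise.

```python
from itertools import product

def chsh_edges():
    """Edges of the CHSH scenario as sets of strings 'ab|xy'."""
    bits = (0, 1)
    wirings = list(product(bits, repeat=2))   # f stored as (f(0), f(1))
    edges = set()
    for x, f in product(bits, wirings):       # Alice first, Bob uses y = f(a)
        edges.add(frozenset(f"{a}{b}|{x}{f[a]}" for a in bits for b in bits))
    for y, g in product(bits, wirings):       # Bob first, Alice uses x = g(b)
        edges.add(frozenset(f"{a}{b}|{g[b]}{y}" for a in bits for b in bits))
    return edges

def pr_box(vertex):
    """Assumed form of the PR box: 1/2 if a XOR b == x AND y, else 0."""
    a, b, x, y = (int(ch) for ch in vertex.replace("|", ""))
    return 0.5 if (a ^ b) == (x & y) else 0.0

edges = chsh_edges()
# Constant wirings are produced by both loops, so the set removes duplicates.
print(len(edges))
# Normalization on every edge, i.e. the PR box is a probabilistic model.
print(all(sum(pr_box(v) for v in e) == 1.0 for e in edges))
```

The second check includes the red and green edges, so it tests more than the usual normalization per pair of settings.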

Proposition 3.3.2. Let $B_{n,k,m}$ be a Bell scenario. Then $\mathcal{G}(B_{n,k,m})$ is the standard no-signaling
polytope containing all no-signaling boxes of type (n,k,m).

Proof. While this follows from an application of the multipartite version of Proposition 3.1.4,
we believe that an independent proof is more instructive.
We identify the vertices of $B_{n,k,m}$ with the events

\[ a_1 \ldots a_n | x_1 \ldots x_n, \qquad a_i \in \{1, \ldots, m\},\quad x_i \in \{1, \ldots, k\} \]

in the usual Bell scenario notation.

We show first that a no-signaling box of type (n,k,m) is indeed a probabilistic model on
$B_{n,k,m}$. Such a box is an assignment of a probability $p(a|x)$ to each event $a|x$ such that the
no-signaling equations

\[ \sum_{a_i} p(a_1 \ldots a_n | x_1 \ldots x_n) = \sum_{a_i} p(a_1 \ldots a_n | x_1 \ldots x_i' \ldots x_n) \tag{3.8} \]

hold (where the right-hand side is the same except that the setting $x_i$ has been replaced by some
other setting $x_i'$), as well as the normalization condition

\[ \sum_{a_1, \ldots, a_n} p(a_1 \ldots a_n | x_1 \ldots x_n) = 1. \tag{3.9} \]

Now we consider any edge in the scenario $B_{n,k,m}$. Without loss of generality, we take the underlying
total order of the parties to be the numerical one, so that the temporal order of the parties'
measurements is simply 1, ..., n. The settings used by the parties are then determined by functions
$x_i = f_i(a_1, \ldots, a_{i-1})$, and we need to consider

\[ \sum_{a_1, \ldots, a_n} p\big(a_1 \ldots a_n \,\big|\, f_1() \ldots f_n(a_1, \ldots, a_{n-1})\big), \]

where $x_1 = f_1()$ is a function without arguments, i.e. a constant. Since the vector of settings does
not depend on $a_n$, the no-signaling equations imply that the last function $f_n(a_1, \ldots, a_{n-1})$ can be
replaced by an arbitrary constant setting $x_n$ without changing the value of the sum. After applying
this modification, the vector of settings does not depend on $a_{n-1}$, and then the setting of party
n - 1 can be taken to be some fixed $x_{n-1}$. Applying this procedure repeatedly eventually replaces
all functions $f_i(a_1, \ldots, a_{i-1})$ by constant settings $x_i$. Then the normalization equation implies that
the sum has the value 1, as has been claimed.

Conversely, suppose that p is a probabilistic model on $B_{n,k,m}$. Then p satisfies the normalization
equation since taking all functions $f_j$ to be constants $x_j$ gives precisely (3.9). In order to prove the
no-signaling equation, we fix arbitrary outputs $b_j$ and choose all functions to be constants $f_j = x_j$,
except for

\[ f_n(a_1, \ldots, a_{n-1}) = \begin{cases} x_n & \text{if } a_j = b_j \text{ for all } j < n, \\ x_n' & \text{otherwise;} \end{cases} \]

which gives the equation

\[ \sum_{a_n} p(b_1 \ldots b_{n-1} a_n | x_1 \ldots x_n) + \sum_{a_n} \sum_{(a_1,\ldots,a_{n-1}) \neq (b_1,\ldots,b_{n-1})} p(a_1 \ldots a_n | x_1 \ldots x_n') = 1. \]

Upon combining this with the already proven normalization equation

\[ \sum_{a_n} p(b_1 \ldots b_{n-1} a_n | x_1 \ldots x_n') + \sum_{a_n} \sum_{(a_1,\ldots,a_{n-1}) \neq (b_1,\ldots,b_{n-1})} p(a_1 \ldots a_n | x_1 \ldots x_n') = 1, \]

we obtain (3.8) with i = n and $b_1 \ldots b_{n-1}$ in place of $a_1 \ldots a_{n-1}$. The other no-signaling equations
can be obtained in the same way, choosing different orders of the parties. □

In particular, this proof shows explicitly how the non-trivial edges occurring in the definition
of "$\otimes$" give rise to the no-signaling property.

4. Classical models 

For each scenario H, one can define several relevant subsets of the set $\mathcal{G}(H)$. In the following,
we will define these and study some of their properties in some detail, starting with the set of classical
models $\mathcal{C}(H)$. We will use the $B_{n,k,m}$ as a "running example" illustrating that $B_{n,k,m}$ indeed
behaves exactly as one would expect from the Bell scenario of type (n,k,m).

4.1. Definition. What we mean by classical here comprises the idea of hidden variables as 
they occur in results of Bell [Bel64], Fine [Fin82] and Kochen-Specker [KS67]. 

Definition 4.1.1. Let H be a contextuality scenario. 

(a) A probabilistic model $p \in \mathcal{G}(H)$ is deterministic if $p(v) \in \{0, 1\}$ for all $v \in V(H)$.

(b) A probabilistic model $p \in \mathcal{G}(H)$ is classical if it is a convex combination of deterministic
ones.

Following Fine [Fin82] and certain refinements of his results to considerations of contextual- 
ity [LSW11, Thm. 6], [AB11, Thm. 8.1], we note that classical models are precisely those which 
can be explained in terms of noncontextual hidden variables. 

Since, for finite H , there are only finitely many deterministic models, the set of classical models 
is a polytope. We denote this polytope by C(H). 

Example 4.1.2 ([CEGA96]). For H the contextuality scenario of Figure 2, we claim that
$\mathcal{C}(H) = \emptyset$. To see this, let $V_1$ be the set of vertices to which a given deterministic model assigns
a 1. Since the set $V_1$ is required to intersect every edge in precisely one vertex, and every vertex
appears in precisely two edges, $2|V_1|$ needs to be equal to the number of edges. Since the latter is
odd, we conclude that this is impossible, so that no deterministic model exists, which means that
$\mathcal{C}(H) = \emptyset$. See [AB11, Sec. 7.1] for a very general version of this argument.

Remark 4.1.3. As we just exemplified, a deterministic model p is determined by the set of
vertices

\[ V_1 = \{ v \in V \mid p(v) = 1 \}. \tag{4.1} \]

By definition of deterministic model, $V_1$ has the property that it intersects every edge in exactly
one vertex: $V_1$ is an exact transversal [Eit94]. Conversely, every exact transversal $V_1$ defines a
deterministic model in this way. We have that $\mathcal{C}(H) \neq \emptyset$ if and only if H has an exact transversal.
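For small scenarios, the correspondence between deterministic models and exact transversals can be explored by exhaustive search. The following sketch is illustrative: the function name `exact_transversals` is hypothetical, and the triangle scenario is a toy example chosen here (not the scenario of Figure 2) in which every vertex lies in exactly two edges and the number of edges is odd, so that the parity argument of Example 4.1.2 applies.

```python
from itertools import combinations

def exact_transversals(vertices, edges):
    """All vertex sets meeting every edge in exactly one vertex (brute force)."""
    found = []
    for size in range(len(vertices) + 1):
        for subset in combinations(vertices, size):
            chosen = set(subset)
            if all(len(chosen & set(e)) == 1 for e in edges):
                found.append(chosen)
    return found

# Toy scenario (an assumption for illustration): three vertices, every pair is an edge.
triangle = exact_transversals([1, 2, 3], [{1, 2}, {2, 3}, {1, 3}])

# Single-party scenario B_{1,2,2}: vertices "a|x", one edge per setting x.
b122 = exact_transversals(["0|0", "1|0", "0|1", "1|1"],
                          [{"0|0", "1|0"}, {"0|1", "1|1"}])

print(triangle)   # no exact transversal, hence no classical model
print(len(b122))  # one deterministic model per choice of outcome for each setting
```

The search is exponential in the number of vertices, so it is only meant for very small hypergraphs.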

In the same way that probabilistic models coincide with the no-signaling polytopes, the
classicality of a contextuality scenario naturally extends that of Bell scenarios.

Proposition 4.1.4. Let $B_{n,k,m}$ be a Bell scenario. Then $\mathcal{C}(B_{n,k,m})$ is the standard Bell
polytope.

Proof. This is clear since one way to define the Bell polytope is as the convex hull of
deterministic models, and a deterministic model in the contextuality scenario $B_{n,k,m}$ is the same as a
local deterministic model in the Bell sense. □

4.2. Non-orthogonality graphs. We will now start to relate contextuality scenarios and the 
associated classes of probabilistic models to graph theory. 

In the (hyper-)graph theory literature, one frequently considers the orthogonality graph of 
a hypergraph (also referred to as its primal or Gaifman graph [GLS01]). The orthogonality 
graph of a hypergraph H is obtained by replacing all edges in H by complete subgraphs on the 
same vertices. This coincides with the orthogonality relation present in the generalized sample 
spaces of [FR72]. Moreover, upon thinking of H as an abstract simplicial complex with E as its 
facets, its orthogonality graph is the 1-skeleton of this simplicial complex. 

For the purpose of relating to the graph theory of Appendix A, however, it will be more 
convenient to consider the complement of this graph. The drawback of this is that it makes some 
of our considerations a bit more confusing, like the proof of Lemma 4.2.2. 

Definition 4.2.1 (Non-orthogonality graph). Let H be a contextuality scenario. The
non-orthogonality graph Ort(H) is the undirected graph with the same vertices as H and adjacency
relation

\[ u \sim v \iff \nexists\, e \in E(H) \text{ with } \{u, v\} \subseteq e. \]

We say that two different vertices u and v of H are orthogonal, which we denote by $u \perp v$, if
they are not adjacent in Ort(H), i.e. if they do belong to a common edge in H. We now make use
of the concepts discussed in Appendix A, starting with the strong product of graphs $\boxtimes$.

Lemma 4.2.2. Let $H_A$ and $H_B$ be contextuality scenarios. Then,

\[ \mathrm{Ort}(H_A \otimes H_B) = \mathrm{Ort}(H_A) \boxtimes \mathrm{Ort}(H_B). \]

Proof. Clearly both sides are graphs having $V(H_A) \times V(H_B)$ as their set of vertices, so what
needs to be shown is that the adjacency relations coincide.

We first prove that if $(u_A, u_B) \perp (v_A, v_B)$ in $\mathrm{Ort}(H_A \otimes H_B)$, then these two vertices are also
not adjacent in $\mathrm{Ort}(H_A) \boxtimes \mathrm{Ort}(H_B)$. The assumption means that there is an edge $e \in E(H_A \otimes H_B)$
which contains both $(u_A, u_B)$ and $(v_A, v_B)$; this edge has one of the two forms of (3.1). If it is in
$E_{A\to B}$, then $u_A, v_A \in e_A$, meaning that $u_A \perp v_A$. Similarly, if the edge is in $E_{A\leftarrow B}$, then $u_B \perp v_B$.
The conclusion follows from either case.

For proving the opposite implication, we show that $(u_A, u_B) \perp (v_A, v_B)$ in $\mathrm{Ort}(H_A) \boxtimes \mathrm{Ort}(H_B)$
implies the same in $\mathrm{Ort}(H_A \otimes H_B)$. The assumption means that $u_A \perp v_A$ or $u_B \perp v_B$; by symmetry,
it is enough to consider the case $u_A \perp v_A$. Then, there exists some $e_A \in E(H_A)$ with $u_A, v_A \in e_A$.
Now choose $e_B, e_B' \in E_B$ such that $u_B \in e_B$ and $v_B \in e_B'$, and some function $f : e_A \to E_B$ with
$f(u_A) = e_B$ and $f(v_A) = e_B'$. Then

\[ \bigcup_{a \in e_A} \{a\} \times f(a) \]

is an edge in $H_A \otimes H_B$ containing $(u_A, u_B)$ and $(v_A, v_B)$, which proves the claim. □

4.3. Classicality from the fractional packing number. We now show how to detect 
classicality using a graph-theoretic invariant from Appendix A. 

Proposition 4.3.1. A probabilistic model $p \in \mathcal{G}(H)$ is in $\mathcal{C}(H)$ if and only if $\alpha^*(\mathrm{Ort}(H), p) \le 1$.

Note that the normalization $\sum_{v \in e} p(v) = 1$ for every $e \in E(H)$ implies that $\alpha^*(\mathrm{Ort}(H), p) \ge 1$,
so that the condition $\alpha^*(\mathrm{Ort}(H), p) \le 1$ is actually equivalent to $\alpha^*(\mathrm{Ort}(H), p) = 1$.

Proof. By definition, $\alpha^*(\mathrm{Ort}(H), p) \le 1$ means that if $q : V(H) \to [0, 1]$ are vertex weights
satisfying $\sum_{v \in C} q_v \le 1$ for all cliques $C \subseteq \mathrm{Ort}(H)$, then also

\[ \sum_{v \in V(H)} q_v\, p(v) \le 1. \tag{4.2} \]

In order to prove the claim for all classical p, it is sufficient to consider deterministic p. In this
case, the associated set $V_1 = \{v \in V(H) \mid p(v) = 1\}$ is itself a clique in Ort(H), while all other p(v)
vanish, and hence (4.2) follows from the assumption on q.

For the other direction, we use the dual formulation (A.7) of the weighted fractional packing
number: there exists a number $x_C \ge 0$ associated to every clique $C \subseteq \mathrm{Ort}(H)$ such that
$p(v) \le \sum_{C \ni v} x_C$ and $\sum_C x_C = 1$. We claim that every C for which $x_C \neq 0$ corresponds to a deterministic
model via (4.1); in other words, if $x_C \neq 0$, then $|e \cap C| = 1$ for every $e \in E(H)$. First, $|e \cap C| \le 1$,
since e is an independent set in Ort(H) while C is a clique. Second, the chain of inequalities

\[ 1 = \sum_{v \in e} p(v) \le \sum_{v \in e} \sum_{C \ni v} x_C = \sum_{C \text{ with } C \cap e \neq \emptyset} x_C \le \sum_C x_C = 1 \]

actually needs to be a chain of equalities, which proves the claim that if $x_C \neq 0$, then $|e \cap C| = 1$
for every $e \in E(H)$. Furthermore, we also conclude that $p(v) = \sum_{C \ni v} x_C$, or $p = \sum_C x_C 1_C$. This
is an explicit decomposition of p as a convex combination of deterministic models. □

Problem 4.3.2. Can this result be used to derive a combinatorial characterization of the facets 
of C(H), similar in spirit to Theorem 2.4.3? 

4.4. Classical models on products. 

Proposition 4.4.1.

\[ \mathcal{C}(H_A \otimes H_B) = \mathrm{conv}\big(\mathcal{C}(H_A) \otimes \mathcal{C}(H_B)\big), \]

where conv(S) denotes the convex hull of the elements in S.

This should be contrasted with Remark 3.1.6.

Proof. Let $p_A \in \mathcal{C}(H_A)$ and $p_B \in \mathcal{C}(H_B)$ be deterministic models. Then also $p_A \otimes p_B$ is a
deterministic model on $H_A \otimes H_B$, which proves $\mathcal{C}(H_A \otimes H_B) \supseteq \mathrm{conv}(\mathcal{C}(H_A) \otimes \mathcal{C}(H_B))$ by convexity
of $\mathcal{C}(H_A \otimes H_B)$.

Conversely, consider a deterministic model $p_{AB}$ on $H_A \otimes H_B$. Let $V_1$ be the set of vertices
in $H_A \otimes H_B$ for which $p_{AB}(v) = 1$, and define $p_A \in \mathcal{C}(H_A)$ and $p_B \in \mathcal{C}(H_B)$ as follows: for each
$v_A \in V_A$, set $p_A(v_A) = 1$ if and only if there exists $v_B \in V_B$ such that $(v_A, v_B) \in V_1$, and $p_A(v_A) = 0$
otherwise. Similarly, define $p_B$. We want to check that these are indeed probabilistic models,
i.e. show that $\sum_{v_A \in e_A} p_A(v_A) = 1$ and $\sum_{v_B \in e_B} p_B(v_B) = 1$ for every edge $e_A$ of $H_A$ and $e_B$ of $H_B$.
As $V_1$ is an exact transversal of $H_A \otimes H_B$, no two elements of $V_1$ belong to the same edge. This
implies that if both $(u_A, u_B), (u_A', u_B') \in V_1$, then there is no $e_A \in E(H_A)$ with $\{u_A, u_A'\} \subseteq e_A$:
for if there were, then we could construct an edge as in the proof of Lemma 4.2.2 which contains
both $(u_A, u_B)$ and $(u_A', u_B')$. It follows that for each edge $e_A \in E_A$, there is at most one vertex $v_A \in e_A$
with $p_A(v_A) = 1$. In fact, there is exactly one such vertex, since $e_A \times e_B$ is an edge of $H_A \otimes H_B$
for any $e_B \in E(H_B)$, and this edge intersects $V_1$. Hence, $p_A$ is a probabilistic model on $H_A$.
Similarly, $p_B$ is a probabilistic model. Since $p_{AB} = p_A \otimes p_B$ by construction, the claim follows by
convexity. □



5. Quantum models 

5.1. Definition and basic properties. We denote by $\mathcal{B}(\mathcal{H})$ the set of all bounded operators
on a Hilbert space $\mathcal{H}$. The notation $\mathcal{B}_+(\mathcal{H})$ stands for the subset of positive semi-definite operators.
A quantum state $\rho$ is given by a normalized density operator, i.e. by some $\rho \in \mathcal{B}_{+,1}(\mathcal{H})$, where

\[ \mathcal{B}_{+,1}(\mathcal{H}) \stackrel{\text{def}}{=} \{ \rho \in \mathcal{B}_+(\mathcal{H}) \mid \operatorname{tr} \rho = 1 \}. \]

Definition 5.1.1. Let H be a contextuality scenario. An assignment of probabilities
$p : V(H) \to [0, 1]$ is a quantum model if there exists a Hilbert space $\mathcal{H}$, a quantum state $\rho \in \mathcal{B}_{+,1}(\mathcal{H})$
and a projection operator $P_v \in \mathcal{B}(\mathcal{H})$ associated to every $v \in V$ which constitute projective
measurements in the sense that

\[ \sum_{v \in e} P_v = 1_{\mathcal{H}} \quad \forall e \in E(H), \tag{5.1} \]

and reproduce the given probabilities,

\[ p(v) = \operatorname{tr}(\rho P_v) \quad \forall v \in V(H). \tag{5.2} \]

The set of all quantum models is the quantum set $\mathcal{Q}(H)$. Thanks to (5.1), it is clear that
$\mathcal{Q}(H) \subseteq \mathcal{G}(H)$, i.e. every quantum model is a probabilistic model.

Proposition 5.1.2. (a) $\mathcal{Q}(H)$ is convex.
(b) Every classical model is a quantum model: $\mathcal{C}(H) \subseteq \mathcal{Q}(H)$.

Proof. (a) Let $p_1, p_2 \in \mathcal{Q}(H)$ be quantum models implemented by Hilbert spaces $\mathcal{H}_1$,
$\mathcal{H}_2$, projection operators $P_{1,v}$, $P_{2,v}$ and states $\rho_1$, $\rho_2$ on the respective Hilbert space. Then
for any coefficient $\lambda \in [0, 1]$, we construct a quantum representation of $\lambda p_1 + (1 - \lambda) p_2$ by
setting

\[ \mathcal{H} = \mathcal{H}_1 \oplus \mathcal{H}_2, \qquad P_v = P_{1,v} \oplus P_{2,v}, \qquad \rho = \lambda \rho_1 \oplus (1 - \lambda) \rho_2. \]

It is immediate to verify that this is indeed a quantum representation of $\lambda p_1 + (1 - \lambda) p_2$.
(b) This follows from (a) upon showing that every deterministic model is quantum. A
deterministic model p can be seen to be quantum by setting $\mathcal{H} = \mathbb{C}$, $P_v = p(v) \cdot 1$ and
$\rho = 1$.

□
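For readers who want the verification in part (a) spelled out, the following short computation uses only (5.1) and (5.2) for the two given representations, together with the block-diagonal form of the operators; it is a sketch of the routine check rather than an addition to the argument.

```latex
\begin{align*}
\sum_{v \in e} P_v
  &= \Big( \sum_{v \in e} P_{1,v} \Big) \oplus \Big( \sum_{v \in e} P_{2,v} \Big)
   = 1_{\mathcal{H}_1} \oplus 1_{\mathcal{H}_2}
   = 1_{\mathcal{H}}
   \qquad \forall e \in E(H), \\
\operatorname{tr}(\rho P_v)
  &= \lambda \operatorname{tr}(\rho_1 P_{1,v}) + (1 - \lambda) \operatorname{tr}(\rho_2 P_{2,v})
   = \lambda p_1(v) + (1 - \lambda) p_2(v)
   \qquad \forall v \in V(H), \\
\operatorname{tr} \rho
  &= \lambda \operatorname{tr} \rho_1 + (1 - \lambda) \operatorname{tr} \rho_2 = 1 .
\end{align*}
```

Each $P_v$ is a projection because a direct sum of projections is a projection, and $\rho$ is positive semi-definite because both blocks are.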
5.2. Quantum models on products. What is the set of quantum models on a product
$H_A \otimes H_B$? The following characterization generalizes the commutativity paradigm of quantum
correlations in Bell scenarios [JNP+11, Fri12]. For a related argument, see [CSW10, (iv)].

Proposition 5.2.1. Let $H_A$ and $H_B$ be two contextuality scenarios. Then $p \in \mathcal{Q}(H_A \otimes H_B)$
if and only if there is a Hilbert space $\mathcal{H}$, a quantum state $\rho \in \mathcal{B}_{+,1}(\mathcal{H})$ and projection operators
$P_{A,u} \in \mathcal{B}(\mathcal{H})$, $P_{B,v} \in \mathcal{B}(\mathcal{H})$ assigned to every $u \in V(H_A)$, $v \in V(H_B)$ such that

\[ \sum_{u \in e_A} P_{A,u} = 1_{\mathcal{H}} = \sum_{v \in e_B} P_{B,v} \quad \forall e_A \in E(H_A),\ e_B \in E(H_B), \]

\[ [P_{A,u}, P_{B,v}] = 0 \quad \forall u \in V(H_A),\ v \in V(H_B), \]

and the given probabilistic model is reproduced,

\[ p(u, v) = \operatorname{tr}(\rho P_{A,u} P_{B,v}) \quad \forall u \in V(H_A),\ v \in V(H_B). \tag{5.3} \]

Proof. We start from (5.3) and assign to every vertex $(u, v) \in V(H_A \otimes H_B)$ the projection

\[ P_{(u,v)} = P_{A,u} P_{B,v}, \]

so that (5.2) holds by (5.3). By symmetry, it is sufficient to show (5.1) for an edge $e \in E_{A\to B}$ given
by

\[ e = \bigcup_{a \in e_A} \{a\} \times f(a) \quad \text{with } e_A \in E_A,\ f : e_A \to E_B. \]

In this case,

\[ \sum_{(u,v) \in e} P_{(u,v)} = \sum_{u \in e_A} \sum_{v \in f(u)} P_{A,u} P_{B,v} = \sum_{u \in e_A} P_{A,u} \cdot 1_{\mathcal{H}} = 1_{\mathcal{H}}, \]

which is analogous to the computation in the proof of Proposition 3.1.5.

Conversely, one can construct the "local" observables $P_{A,u}$ and $P_{B,v}$ from a quantum model in
$\mathcal{Q}(H_A \otimes H_B)$ by noting that the operators

\[ P_{A,u} = \sum_{v \in e_B} P_{(u,v)}, \qquad P_{B,v} = \sum_{u \in e_A} P_{(u,v)} \tag{5.4} \]

do not depend on the choice of $e_B \in E(H_B)$ or $e_A \in E(H_A)$, respectively. To see this, it is enough
to prove that

\[ \sum_{v \in e_B} P_{(u,v)} = \sum_{v \in e_B'} P_{(u,v)} \tag{5.5} \]

for any $u \in V(H_A)$ and $e_B, e_B' \in E(H_B)$, which is analogous to the proof of Proposition 3.3.2.
Choose some $e_A \ni u$ and consider the function $f : e_A \to E(H_B)$ with

\[ f(u') = \begin{cases} e_B & \text{if } u' = u, \\ e_B' & \text{otherwise.} \end{cases} \]

An application of (5.1) to the edge defined by f as well as the edge $e_A \times e_B'$ gives

\[ \sum_{v \in e_B} P_{(u,v)} + \sum_{u' \in e_A \setminus \{u\}} \sum_{v \in e_B'} P_{(u',v)} = \sum_{u' \in e_A} \sum_{v \in e_B'} P_{(u',v)}, \]

which reduces to (5.5) after cancelling terms. This shows that the "local" operators (5.4) are
well-defined.

The normalization condition $\sum_{u \in e_A} P_{A,u} = 1_{\mathcal{H}} = \sum_{v \in e_B} P_{B,v}$ now is an immediate consequence
of (5.1). Finally, the commutativity $[P_{A,u}, P_{B,v}] = 0$ follows again from the normalization

\[ \sum_{u' \in e_A} \sum_{v' \in e_B} P_{(u',v')} = 1_{\mathcal{H}} \]

taken for some $e_A \ni u$ and $e_B \ni v$: the terms in this sum are necessarily mutually orthogonal,
and hence commute pairwise; but now both $P_{A,u}$ and $P_{B,v}$ are partial sums of this big sum, and
therefore these commute as well. Also, mutual orthogonality implies $P_{(u,v)} = P_{A,u} P_{B,v}$, which
yields the desired probabilities (5.3). □

For example, $\mathcal{H}$ might itself be a tensor product $\mathcal{H}_A \otimes \mathcal{H}_B$, such that every $P_{A,u}$ operates
on the first factor, while every $P_{B,v}$ operates on the second, while $\rho$ is a state on $\mathcal{H}_A \otimes \mathcal{H}_B$,
possibly entangled. The question whether every quantum model in $\mathcal{Q}(H_A \otimes H_B)$ arises, at least
approximately, from this tensor paradigm is a generalization of Tsirelson's problem [JNP+11,
Fri12].

Problem 5.2.2. Is the set of all quantum models in the tensor paradigm dense in $\mathcal{Q}(H_A \otimes H_B)$?

There are some immediate consequences of Proposition 5.2.1:

Corollary 5.2.3.

\[ \mathcal{Q}(H_A) \otimes \mathcal{Q}(H_B) \subseteq \mathcal{Q}(H_A \otimes H_B) \tag{5.6} \]

Again, the CHSH scenario $B_{2,2,2} = B_{1,2,2} \otimes B_{1,2,2}$ exemplifies that (5.6) is not an equality in
general. A proof analogous to Proposition 5.2.1 also holds for quantum models on n-fold products,
and from this we deduce that:

Corollary 5.2.4. $\mathcal{Q}(B_{n,k,m})$ is the set of quantum correlations in the Bell sense in the
commutativity paradigm.

5.3. Quantum contextuality. We conclude this section by formulating the Kochen-Specker
theorem of [CEGA96] in our formalism: the classical set of the scenario $H_{KS}$ of Figure 2 is empty,
while the quantum set is nonempty:

Theorem 5.3.1 (Kochen-Specker). There exists a contextuality scenario $H_{KS}$ for which

\[ \mathcal{C}(H_{KS}) = \emptyset, \qquad \mathcal{Q}(H_{KS}) \neq \emptyset. \]

It is not clear to us whether $\mathcal{Q}(H_{KS}) = \mathcal{G}(H_{KS})$, but we suspect that this is not the case. A
natural question is whether there exists a proof of the Kochen-Specker theorem in which this is the case:

Problem 5.3.2. Is there H for which

\[ \mathcal{C}(H) \neq \mathcal{Q}(H) = \mathcal{G}(H)\ ? \]

Some hypergraph H constructed from the GHZ paradox [GHSZ90] might be a good candidate 
for this hypothetical phenomenon. 

Proposition 5.3.3. There exists H as in Problem 5.3.2 if and only if there exists some H' with
a unique probabilistic model which is quantum, but not classical.

Proof. Clearly if such an H' exists, then we can take H = H' in Problem 5.3.2. Conversely,
such an H' can be constructed as an induced subscenario of any H of Problem 5.3.2 by using
Theorem 2.4.3, whose proof adapts immediately to show that the resulting unique probabilistic
model on H' will also be quantum. □

6. A hierarchy of semidefinite programs characterizing quantum models

For Bell scenarios, there is a sequence ("hierarchy") of semidefinite programs characterizing
quantum correlations with the commutativity paradigm due to Navascués, Pironio and Acín [NPA07,
NPA08]. Here, we extend this hierarchy to contextuality scenarios. This may be considered a
special case of the general hierarchy for noncommutative polynomial optimization [PNA10].

6.1. Definition of the hierarchy. We introduce the main idea before getting to the technical
details. Given a quantum model as in Definition 5.1.1, not only can one consider the expectation
values $\operatorname{tr}(\rho P_v)$, but also any expectation value of the form

\[ \operatorname{tr}(\rho P_{v_1} \ldots P_{v_n}), \tag{6.1} \]

where $\mathbf{v} = v_1 \ldots v_n \in V(H)^n$ is any finite sequence of vertices. The idea is to find properties of these
values which characterize quantum models. For $n \ge 2$, these values are typically not determined
by the probabilities $p(v) = \operatorname{tr}(\rho P_v)$ alone; the hierarchy works with these quantities as unknown
variables whose values have to be determined in such a way as to be consistent with arising from a
quantum model as in (6.1).

Now for some notation. As a shorthand, we also write $P_{\mathbf{v}}$ for $P_{v_1} \ldots P_{v_n}$, although this is in
general not a projection. When V is a set, we write $V^{*n}$ for the set of all strings of up to n elements
of V, i.e. $V^{*n} \stackrel{\text{def}}{=} \bigcup_{k \le n} V^k$, and $V^* \stackrel{\text{def}}{=} \bigcup_{k \in \mathbb{N}} V^k$ for the set of all strings over V. $\emptyset \in V^*$ is the
empty string of length 0 and the associated operator is $P_\emptyset \stackrel{\text{def}}{=} 1$. For $\mathbf{v} = v_1 \ldots v_n$ a string, we
denote its reverse by $\mathbf{v}^\dagger = v_n \ldots v_1$. This notation makes sense in our context since $P_{\mathbf{v}^\dagger} = P_{\mathbf{v}}^\dagger$. For
$\mathbf{v} \in V^*$ and $\mathbf{w} \in V^*$, we write their concatenation simply as $\mathbf{v}\mathbf{w} \in V^*$, so that $P_{\mathbf{v}\mathbf{w}} = P_{\mathbf{v}} P_{\mathbf{w}}$. We
also use $v_1 \ldots \hat{v}_i \ldots v_n$ as a shorthand for $v_1 \ldots v_{i-1} v_{i+1} \ldots v_n$.

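Lemma 6.1.1. Let $p \in \mathcal{Q}(H)$ be a quantum model with $v \mapsto P_v \in \mathcal{B}_+(\mathcal{H})$, $\rho \in \mathcal{B}_{+,1}(\mathcal{H})$. Then
the matrix M indexed by $\mathbf{v}, \mathbf{w} \in V(H)^{*n}$ with entries

\[ M_{\mathbf{v},\mathbf{w}} = \operatorname{tr}(\rho P_{\mathbf{v}} P_{\mathbf{w}}^\dagger) \tag{6.2} \]

has the following properties:

(a) M is positive semidefinite.
(b)

\[ M_{\emptyset,\emptyset} = 1. \tag{6.3} \]

(c) For every $e \in E(H)$,

\[ \sum_{x \in e} M_{\mathbf{v}x,\mathbf{w}} = M_{\mathbf{v},\mathbf{w}}. \tag{6.4} \]

(d) If $v_n \perp w_m$, then

\[ M_{v_1 \ldots v_n, w_1 \ldots w_m} = 0. \tag{6.5} \]

Hermiticity of M implies that (6.4) also holds with x appended to $\mathbf{w}$ rather than $\mathbf{v}$.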

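Proof. (a) It needs to be shown that for any vector $x \in \mathbb{C}^{V(H)^{*n}}$ with components
$x_{\mathbf{v}} \in \mathbb{C}$, $\mathbf{v} \in V(H)^{*n}$, the expression

\[ \sum_{\mathbf{v},\mathbf{w}} x_{\mathbf{v}}^* M_{\mathbf{v},\mathbf{w}} x_{\mathbf{w}} \]

is nonnegative. By the definition (6.2), this is equal to

\[ \sum_{\mathbf{v},\mathbf{w}} \operatorname{tr}(\rho\, x_{\mathbf{v}}^* P_{\mathbf{v}} P_{\mathbf{w}}^\dagger x_{\mathbf{w}}). \]

With $Q = \sum_{\mathbf{v}} x_{\mathbf{v}} P_{\mathbf{v}}^\dagger$, this is of the form $\operatorname{tr}(\rho Q^\dagger Q)$, and therefore indeed nonnegative.

(b) Since $\rho$ is a normalized state, $M_{\emptyset,\emptyset} = \operatorname{tr}(\rho) = 1$.

(c) The requirement (5.1) implies that

\[ \sum_{x \in e} P_{\mathbf{v}} P_x = P_{\mathbf{v}}, \]

from which (6.4) directly follows.

(d) This is a direct consequence of $P_{v_n} \perp P_{w_m}$ for $v_n \perp w_m$.

□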
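The properties of Lemma 6.1.1 can be checked numerically on a very small example. The sketch below is an illustration under assumptions chosen here, not part of the construction in the text: the scenario is the single-party scenario $B_{1,2,2}$, the projectors are those onto two mutually unbiased bases of a qubit, the state is a pure state, and the level is n = 1. All names (`PROJ`, `entry`, `quadratic_form`, and so on) are hypothetical.

```python
from itertools import product

# Assumed toy quantum model on B_{1,2,2}: projectors onto two qubit bases.
PROJ = {
    "0|0": [[1.0, 0.0], [0.0, 0.0]],
    "1|0": [[0.0, 0.0], [0.0, 1.0]],
    "0|1": [[0.5, 0.5], [0.5, 0.5]],
    "1|1": [[0.5, -0.5], [-0.5, 0.5]],
}
EDGES = [("0|0", "1|0"), ("0|1", "1|1")]
RHO = [[1.0, 0.0], [0.0, 0.0]]           # a pure state
IDENTITY = [[1.0, 0.0], [0.0, 1.0]]

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def op(string):
    """P_v = P_{v_1} ... P_{v_n}; the empty string gives the identity."""
    result = IDENTITY
    for vertex in string:
        result = matmul(result, PROJ[vertex])
    return result

def entry(v, w):
    """M_{v,w} = tr(rho P_v P_w^dagger), using P_w^dagger = P_{reversed w}."""
    m = matmul(RHO, matmul(op(v), op(tuple(reversed(w)))))
    return m[0][0] + m[1][1]

STRINGS = [()] + [(v,) for v in PROJ]     # all strings of length at most 1
M = {(v, w): entry(v, w) for v, w in product(STRINGS, repeat=2)}

def quadratic_form(x):
    """sum_{v,w} x_v M_{v,w} x_w for real coefficients x indexed by STRINGS."""
    return sum(x[v] * M[v, w] * x[w] for v, w in product(STRINGS, repeat=2))

print(M[(), ()])                               # property (b)
print(M[("0|0",), ("1|0",)])                   # property (d): orthogonal vertices
print(M[("0|0",), ("0|1",)] + M[("1|0",), ("0|1",)], M[(), ("0|1",)])  # (c)
```

Positive semidefiniteness is not verified exhaustively here; `quadratic_form` only allows spot checks on chosen coefficient vectors.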



The diagonal entries $M_{\mathbf{v},\mathbf{v}}$ represent the expectation values $\operatorname{tr}(\rho P_{v_1} \ldots P_{v_n} P_{v_n} \ldots P_{v_1})$, which
can be interpreted as the probability to obtain the sequence of outcomes $(v_1, \ldots, v_n)$ in some
measurement sequence $(e_1, \ldots, e_n)$ with $v_i \in e_i$. We suspect that this interpretation can be used
to find an interpretation of the "higher" levels of the hierarchy in terms of the lowest level of a
temporally extended scenario, but we have not been able to make this idea work.

Definition 6.1.2. Let H be a contextuality scenario. We say that $p : V(H) \to [0, 1]$ is a
$\mathcal{Q}_n$-model if there is a positive semidefinite matrix M with entries $M_{\mathbf{v},\mathbf{w}}$, $\mathbf{v}, \mathbf{w} \in V(H)^{*n}$, such
that (6.3), (6.4), (6.5) hold and

\[ p(v) = M_{v,\emptyset}. \tag{6.6} \]

Again, it is easy to verify that every $\mathcal{Q}_n$-model is a probabilistic model. By definition, testing
whether a given probabilistic model lies in $\mathcal{Q}_n$ is a semidefinite programming problem of size
$|V(H)|^n \times |V(H)|^n$. By making judicious use of the equations (6.4) and the upcoming (6.10), this
size can be significantly reduced if H has many edges; any practical computation should take this
into account. Furthermore, it can be assumed that all matrix entries are actually in $\mathbb{R}$, i.e. no
imaginary components are needed.
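To get a feeling for how quickly the unreduced problem grows, one can count the index strings directly. The helper below is a hypothetical illustration: it counts all strings of length at most n over the vertex set, whose leading term $|V(H)|^n$ matches the size quoted above, and it does not apply any of the reductions mentioned in the text.

```python
def num_index_strings(num_vertices, level):
    """Number of strings of length at most `level` over `num_vertices` letters."""
    return sum(num_vertices ** k for k in range(level + 1))

# CHSH scenario B_{2,2,2}: 16 vertices ab|xy.
for level in (1, 2, 3):
    print(level, num_index_strings(16, level))
# prints 17, 273 and 4369 for levels 1, 2 and 3
```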

Remark 6.1.3. Besides those of Lemma 6.1.1, there are other properties satisfied by M which
follow from (6.3)-(6.5), and are satisfied in particular by those M of the form (6.2):

(a) If $\mathbf{v}\mathbf{w}^\dagger = \mathbf{v}'\mathbf{w}'^\dagger$, then

\[ M_{\mathbf{v},\mathbf{w}} = M_{\mathbf{v}',\mathbf{w}'}. \tag{6.7} \]

This follows by induction from $M_{v_1 \ldots v_m, \mathbf{w}} = M_{v_1 \ldots v_{m-1}, \mathbf{w}v_m}$, which in turn is a consequence
of (6.4) and (6.5) upon choosing some $e \ni v_m$,

\[ M_{v_1 \ldots v_m, \mathbf{w}} \stackrel{(6.4)}{=} \sum_{x \in e} M_{v_1 \ldots v_m, \mathbf{w}x} \stackrel{(6.5)}{=} M_{v_1 \ldots v_m, \mathbf{w}v_m}, \]

and applying the same trick on the other side shows that this also equals $M_{v_1 \ldots v_{m-1}, \mathbf{w}v_m}$, as
claimed. Equation (6.7) implies in particular that all matrix entries $M_{\mathbf{v},\mathbf{w}}$ are determined
by those of the "first row", i.e. those of the form $M_{\emptyset,\mathbf{v}}$, although this requires $\mathbf{v} \in V(H)^{*2n}$.

(b) Repeating one letter in the index string gives the same matrix entry,

\[ M_{v_1 \ldots v_i v_i \ldots v_n, \mathbf{w}} = M_{v_1 \ldots v_i \ldots v_n, \mathbf{w}}. \tag{6.8} \]

Upon using (6.7), this follows from a very similar argument.

(c) For every $e \in E(H)$,

\[ \sum_{v_i \in e} M_{\mathbf{v},\mathbf{w}} = M_{v_1 \ldots \hat{v}_i \ldots v_n, \mathbf{w}}. \tag{6.9} \]

This is a consequence of (6.4) and (6.7).

(d) Having subsequent orthogonal indices makes the matrix entry vanish,

\[ v_i \perp v_{i+1} \implies M_{v_1 \ldots v_i v_{i+1} \ldots v_m, \mathbf{w}} = 0. \tag{6.10} \]

This follows from (6.9) together with (6.8).
(e) Choosing some $e \ni v$ and applying (6.4) and (6.5) also shows that

\[ M_{v,\emptyset} = M_{v,v}. \tag{6.11} \]

In particular, $p(v) = M_{v,v}$ by (6.6).

Definition 6.1.2 is our semidefinite hierarchy for contextuality scenarios. In the special case of
a bipartite Bell scenario $B_{2,k,m}$, our hierarchy is equivalent to the original one [NPA07, NPA08],
although our level "n" is somewhat different from the hierarchy level "n" used in [NPA08]. In
particular, our set $\mathcal{Q}_1(B_{2,k,m})$ is the set $Q^{1+AB}$ of [NPA08].

Proposition 6.1.4.

\[ \mathcal{Q}(H) \subseteq \ldots \subseteq \mathcal{Q}_n(H) \subseteq \ldots \subseteq \mathcal{Q}_1(H). \]

Proof. Every matrix M showing that p is a $\mathcal{Q}_{n+1}$-model can be restricted to a matrix showing
that p is a $\mathcal{Q}_n$-model, so that $\mathcal{Q}_{n+1}(H) \subseteq \mathcal{Q}_n(H)$. Lemma 6.1.1 shows that every quantum model
is a $\mathcal{Q}_n$-model, which means that $\mathcal{Q}(H) \subseteq \mathcal{Q}_n(H)$. □

6.2. Convergence of the hierarchy. We can also consider infinite matrices M with entries
$M_{\mathbf{v},\mathbf{w}}$ indexed by strings of arbitrary length $\mathbf{v}, \mathbf{w} \in V(H)^*$; starting from a quantum model and
considering (6.2) as the resulting definition of the matrix, the same proof as before shows that the
properties of Lemma 6.1.1 still hold, if we take positive semidefiniteness to mean that

\[ \sum_{\mathbf{v},\mathbf{w}} x_{\mathbf{v}}^* M_{\mathbf{v},\mathbf{w}} x_{\mathbf{w}} \ge 0 \]

for all finitely supported $(x_{\mathbf{v}})_{\mathbf{v} \in V(H)^*}$.

Proposition 6.2.1. If such an infinite matrix exists, then $p \in \mathcal{Q}$.

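Proof. Such an infinite matrix M can be understood to be a (*-algebraic) state on the
*-algebra with generators $\{P_v, v \in V(H)\}$ and relations

\[ P_v = P_v^2 = P_v^*, \qquad \sum_{v \in e} P_v = 1 \quad \forall e \in E(H) \tag{6.12} \]

via the assignment

\[ \phi(P_{v_1} \ldots P_{v_n}) \stackrel{\text{def}}{=} M_{v_1 \ldots v_n, \emptyset} \]

and extending by linearity. Then, the GNS construction (see e.g. [KR83]) turns this into a quantum
representation satisfying (6.6). For this reason, a probabilistic model is quantum if and only if there
exists such an infinite matrix M having the properties of Lemma 6.1.1.
More concretely, this works as follows. First, we claim that

\[ \sum_{\mathbf{v},\mathbf{w} \in V(H)^*} x_{\mathbf{v}}^* M_{\mathbf{v}u,\mathbf{w}u} x_{\mathbf{w}} \le \sum_{\mathbf{v},\mathbf{w} \in V(H)^*} x_{\mathbf{v}}^* M_{\mathbf{v},\mathbf{w}} x_{\mathbf{w}}. \tag{6.13} \]

To see this, choose any $e \ni u$ and write

\[ \sum_{\mathbf{v},\mathbf{w} \in V(H)^*} x_{\mathbf{v}}^* M_{\mathbf{v},\mathbf{w}} x_{\mathbf{w}} = \sum_{\mathbf{v},\mathbf{w} \in V(H)^*} x_{\mathbf{v}}^* \Big( \sum_{u' \in e} M_{\mathbf{v}u',\mathbf{w}u'} \Big) x_{\mathbf{w}} \]
\[ = \sum_{\mathbf{v},\mathbf{w} \in V(H)^*} x_{\mathbf{v}}^* M_{\mathbf{v}u,\mathbf{w}u} x_{\mathbf{w}} + \sum_{u' \in e,\, u' \neq u}\ \sum_{\mathbf{v},\mathbf{w} \in V(H)^*} x_{\mathbf{v}}^* M_{\mathbf{v}u',\mathbf{w}u'} x_{\mathbf{w}} \ \ge \sum_{\mathbf{v},\mathbf{w} \in V(H)^*} x_{\mathbf{v}}^* M_{\mathbf{v}u,\mathbf{w}u} x_{\mathbf{w}}, \]

where the last inequality is due to positive semidefiniteness of M. This proves (6.13).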

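Now we start the construction with the infinite-dimensional vector space spanned by all strings,
$\mathcal{H}_0 \stackrel{\text{def}}{=} \mathrm{lin}_{\mathbb{C}}(V(H)^*)$. The formula

\[ \Big\langle \sum_{\mathbf{v} \in V(H)^*} x_{\mathbf{v}} \mathbf{v},\ \sum_{\mathbf{w} \in V(H)^*} y_{\mathbf{w}} \mathbf{w} \Big\rangle \stackrel{\text{def}}{=} \sum_{\mathbf{v},\mathbf{w} \in V(H)^*} x_{\mathbf{v}}^* M_{\mathbf{v},\mathbf{w}} y_{\mathbf{w}} \]

defines a positive semidefinite inner product on $\mathcal{H}_0$ in terms of the matrix M. The Cauchy-Schwarz
inequality shows that

\[ \mathcal{N} \stackrel{\text{def}}{=} \Big\{ \sum_{\mathbf{v}} x_{\mathbf{v}} \mathbf{v} \in \mathcal{H}_0 \ \Big|\ \sum_{\mathbf{v},\mathbf{w}} x_{\mathbf{v}}^* M_{\mathbf{v},\mathbf{w}} x_{\mathbf{w}} = 0 \Big\} \]

is a linear subspace of $\mathcal{H}_0$. The inner product on the quotient $\mathcal{H}_0/\mathcal{N}$ is positive definite by
definition. We take $\mathcal{H}$ to be the completion of $\mathcal{H}_0/\mathcal{N}$ with respect to the norm coming from this
inner product.

Now for $u \in V(H)$, the operator $P_u$ is defined to act on $\mathcal{H}_0$ as

\[ P_u \Big( \sum_{\mathbf{v} \in V(H)^*} x_{\mathbf{v}} \mathbf{v} \Big) \stackrel{\text{def}}{=} \sum_{\mathbf{v} \in V(H)^*} x_{\mathbf{v}} \mathbf{v}u. \]

Thanks to (6.13), this descends to a well-defined operator in $\mathcal{B}(\mathcal{H})$, which we also denote by $P_u$.
The equation $M_{\mathbf{v}u,\mathbf{w}} = M_{\mathbf{v},\mathbf{w}u}$ guarantees that $P_u$ is self-adjoint, while $M_{\mathbf{v}uu,\mathbf{w}} = M_{\mathbf{v}u,\mathbf{w}}$ shows
that $P_u^2 = P_u$ since

\[ \sum_{\mathbf{v} \in V(H)^*} x_{\mathbf{v}} (\mathbf{v}uu - \mathbf{v}u) \in \mathcal{N}. \]

The equation $\sum_{u \in e} P_u = 1_{\mathcal{H}}$ holds since

\[ \sum_{\mathbf{v} \in V(H)^*} x_{\mathbf{v}} \Big( \sum_{u \in e} \mathbf{v}u - \mathbf{v} \Big) \in \mathcal{N} \]

thanks to (6.4). Finally, the rank-one density operator associated to the empty string $\emptyset \in \mathcal{H}$ is the
desired quantum state, since

\[ \langle \emptyset, P_u \emptyset \rangle = M_{\emptyset,u} = p(u). \]

This ends the GNS construction. □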

From this reasoning, we find that the sequence of sets $(\mathcal{Q}_n)_{n \in \mathbb{N}}$ converges in the following sense:

Theorem 6.2.2.

$$\mathcal{Q} = \bigcap_{n \in \mathbb{N}} \mathcal{Q}_n.$$

Proof ([NPA08]). It needs to be shown that if $p \in \mathcal{Q}_n$ for all $n \in \mathbb{N}$, then $p \in \mathcal{Q}$. To this end, we show that if a matrix $(M^n_{\mathbf{v},\mathbf{w}})_{\mathbf{v},\mathbf{w} \in V(H)^{*n}}$ exists with the required properties for every $n$, then there also exists a corresponding infinite matrix $(M^\infty_{\mathbf{v},\mathbf{w}})_{\mathbf{v},\mathbf{w} \in V(H)^*}$.

For v e V(H)* n , positive semidefiniteness gives the estimate 

(M^) 2 (6 = ?) (i\O, ) 2 < Ae\ • M% = 
which implies Mj" 1, and hence 

\M 2n I 2 < M 2n M 2n s£ 1 

again thanks to positive semidefiniteness. We obtain Mj w e [—1, +1] for all v, w e V(H)* n with 
n s£ 2k. 

Now consider the truncation of any $M^{2n}$ to a matrix indexed by $\mathbf{v},\mathbf{w} \in V(H)^{*n}$. Upon filling this truncation up with 0's, we obtain an infinite matrix indexed by $\mathbf{v},\mathbf{w} \in V(H)^*$ with all elements in $[-1,+1]$. In this way, every matrix $M^n$ becomes an element of $[-1,+1]^{V(H)^* \times V(H)^*}$. The space $[-1,+1]^{V(H)^* \times V(H)^*}$, equipped with the product topology, is second countable, and also compact thanks to Tychonoff's theorem. Hence, the sequence $(M^n)_{n \in \mathbb{N}}$ has a convergent subsequence, and we write $M^\infty$ for its limit. By construction, this $M^\infty$ is an infinite matrix indexed by $\mathbf{v},\mathbf{w} \in V(H)^*$ having all the desired properties. The claim now follows from Proposition 6.2.1. □

Since each $\mathcal{Q}_n(H)$ is defined in terms of a semidefinite program, we say that this represents a hierarchy of semidefinite programs characterizing $\mathcal{Q}(H)$. It is a subfamily of the hierarchies of semidefinite programs in noncommutative optimization introduced in [PNA10], which generalize the "commutative" hierarchies originally discovered in the context of convex optimization [Las02].

6.3. Equivalent definitions of $\mathcal{Q}_1$ and the Lovász number. The following is a reformulation of the set "$\mathcal{E}_{QM}$" considered in [CSW10].

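Proposition 6.3.1. For $p \in \mathcal{G}(H)$, the following are equivalent:

(a) $p \in \mathcal{Q}_1(H)$;

(b) There exists a Hilbert space $\mathcal{H}$, a unit vector $|\Psi\rangle \in \mathcal{H}$ and a vector $|\phi_v\rangle$ for every $v \in V(H)$ such that

(i) $u \perp v \;\Rightarrow\; \langle \phi_u | \phi_v \rangle = 0$,
(ii) $\sum_{v \in e} |\phi_v\rangle = |\Psi\rangle \quad \forall e \in E(H)$,
(iii) $p(v) = \langle \phi_v | \phi_v \rangle$;

(c) There exists a Hilbert space $\mathcal{H}$, a unit vector $|\Psi\rangle \in \mathcal{H}$ and a unit vector $|\psi_v\rangle$ for every $v \in V(H)$ such that

(i) $u \perp v \;\Rightarrow\; \langle \psi_u | \psi_v \rangle = 0$,
(ii) $p(v) = |\langle \psi_v | \Psi \rangle|^2$;

(d) There exists a Hilbert space $\mathcal{H}$, a unit vector $|\Psi\rangle \in \mathcal{H}$ and a projection $P_v$ for every $v \in V(H)$ such that

(i) $u \perp v \;\Rightarrow\; P_u \perp P_v$,
(ii) $p(v) = \langle \Psi | P_v | \Psi \rangle \quad \forall v \in V(H)$;

(e) There exists a Hilbert space $\mathcal{H}$, a unit vector $|\Psi\rangle \in \mathcal{H}$ and a projection $P_v$ for every $v \in V(H)$ such that

(i) $\sum_{v \in e} P_v \le \mathbb{1} \quad \forall e \in E(H)$,
(ii) $p(v) = \langle \Psi | P_v | \Psi \rangle \quad \forall v \in V(H)$.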
In all cases, $\mathcal{H}$ can also be taken to be the real Hilbert space $\mathbb{R}^{|V(H)|}$.
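Proof. (a)⇒(b): By positive semidefiniteness, we can write $M$ as a Gram matrix, so that there exist vectors $|\Psi\rangle$, $|\phi_v\rangle$ in $\mathcal{H} = \mathbb{R}^{|V(H)|}$ such that

$$M_{\emptyset,\emptyset} = \langle \Psi | \Psi \rangle, \qquad M_{\emptyset,v} = \langle \Psi | \phi_v \rangle, \qquad M_{u,v} = \langle \phi_u | \phi_v \rangle,$$

from which (b)(i) and (b)(iii) follow.

Now we fix $e \in E(H)$ and show (b)(ii). We decompose $|\Psi\rangle$ into orthogonal components $|\Psi\rangle = |\Psi^{\parallel}\rangle + |\Psi^{\perp}\rangle$, where $|\Psi^{\parallel}\rangle \in \mathrm{lin}_{\mathbb{C}}\{ |\phi_v\rangle : v \in e \}$. Due to (6.4) and (6.5), the vectors satisfy:

$$\langle \phi_v | \Psi \rangle = M_{v,\emptyset} = \sum_{u \in e} M_{v,u} = M_{v,v} = \langle \phi_v | \phi_v \rangle.$$

Then the equations

$$\langle \phi_v | \Psi^{\parallel} \rangle = \langle \phi_v | \phi_v \rangle$$

imply that $|\Psi^{\parallel}\rangle = \sum_{v \in e} |\phi_v\rangle$. On the other hand,

$$\langle \Psi^{\parallel} | \Psi^{\parallel} \rangle + \langle \Psi^{\perp} | \Psi^{\perp} \rangle = M_{\emptyset,\emptyset} = \sum_{v \in e} M_{\emptyset,v} = \sum_{v,u \in e} M_{u,v} = \sum_{v,u \in e} \langle \phi_u | \phi_v \rangle = \langle \Psi^{\parallel} | \Psi^{\parallel} \rangle$$

shows that $|\Psi^{\perp}\rangle = 0$, so that $\sum_{v \in e} |\phi_v\rangle = |\Psi\rangle$, as desired.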
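(b)⇒(c): Normalizing the $|\phi_v\rangle$ to $|\psi_v\rangle := \frac{1}{\sqrt{\langle \phi_v | \phi_v \rangle}} |\phi_v\rangle$ guarantees the orthogonality relations, and choosing some edge $e \in E(H)$ with $v \in e$ gives

$$|\langle \psi_v | \Psi \rangle|^2 = \frac{1}{\langle \phi_v | \phi_v \rangle} \Big| \sum_{u \in e} \langle \phi_v | \phi_u \rangle \Big|^2 = \frac{\langle \phi_v | \phi_v \rangle^2}{\langle \phi_v | \phi_v \rangle} = \langle \phi_v | \phi_v \rangle,$$

due to the orthogonality relations.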

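(c)⇒(d): Define $P_v := |\psi_v\rangle\langle\psi_v|$.

(d)⇒(e): This is clear since for fixed $e \in E(H)$, all projections $P_v$ for $v \in e$ are mutually orthogonal, which implies $\sum_{v \in e} P_v \le \mathbb{1}_{\mathcal{H}}$.

(e)⇒(a): Define $M_{v,w} := \langle \Psi | P_v P_w | \Psi \rangle$. We check that $M$ satisfies conditions (6.3) to (6.5) and is positive semidefinite:

(6.3) $M_{\emptyset,\emptyset} = \langle \Psi | \Psi \rangle = 1$, since $|\Psi\rangle$ is a unit vector.

(6.4) Consider an edge $e \in E(H)$. Since $p$ is a probabilistic model,

$$\langle \Psi | \Psi \rangle = 1 = \sum_{v \in e} p(v) = \langle \Psi | \sum_{v \in e} P_v | \Psi \rangle,$$

which implies $\sum_{v \in e} P_v |\Psi\rangle = |\Psi\rangle$. Then,

$$\sum_{v \in e} M_{v,w} = \langle \Psi | \sum_{v \in e} P_v P_w | \Psi \rangle = \langle \Psi | P_w | \Psi \rangle = M_{\emptyset,w}.$$

(6.5) If $v \perp w$, then there is an edge $e \in E(H)$ with $v, w \in e$. Hence, $P_v \perp P_w$, so that $M_{v,w} = \langle \Psi | P_v P_w | \Psi \rangle = 0$.

Positive semidefiniteness of $M$ can be shown as in the proof of Lemma 6.1.1. □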

Proposition 6.3.2. A probabilistic model $p \in \mathcal{G}(H)$ is in $\mathcal{Q}_1(H)$ if and only if $\vartheta(\overline{\mathrm{Ort}}(H), p) \le 1$.

Proof. We use the characterization of $\mathcal{Q}_1(H)$ given in Proposition 6.3.1(c). Assuming $p \in \mathcal{Q}_1(H)$, we choose corresponding vectors $|\psi_v\rangle, |\Psi\rangle \in \mathbb{R}^{|V(H)|}$; then, by Definition A.3.1,

$$\vartheta(\overline{\mathrm{Ort}}(H), p) \le \max_{v} \frac{p(v)}{|\langle \psi_v | \Psi \rangle|^2} = 1.$$

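Conversely, if $\vartheta(\overline{\mathrm{Ort}}(H), p) \le 1$, then there is an orthonormal labeling $(|\psi_v\rangle)_{v \in V(H)}$ and a vector $|\Psi\rangle \in \mathbb{R}^{|V(H)|}$ such that $|\langle \psi_v | \Psi \rangle|^2 \ge p(v)$ for all $v$. By choosing $\mathcal{H} = \mathbb{R}^{|V(H)|} \oplus \mathbb{R}^{|V(H)|}$ and setting

$$|\Psi'\rangle := |\Psi\rangle \oplus 0, \qquad |\psi'_v\rangle := \frac{\sqrt{p(v)}}{|\langle \psi_v | \Psi \rangle|} \, |\psi_v\rangle \;\oplus\; \sqrt{1 - \frac{p(v)}{|\langle \psi_v | \Psi \rangle|^2}} \; |e_v\rangle,$$

where the $|e_v\rangle$ form the standard basis of $\mathbb{R}^{|V(H)|}$, one obtains $|\langle \psi'_v | \Psi' \rangle|^2 = p(v)$ with unit vectors $|\psi'_v\rangle$, as desired. □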

This relation to graph theory has a simple first application: 

Proposition 6.3.3.

$$\mathcal{Q}_1(H_A) \otimes \mathcal{Q}_1(H_B) \subseteq \mathcal{Q}_1(H_A \otimes H_B) \tag{6.14}$$

Proof. Combine Proposition 6.3.2 with multiplicativity of $\vartheta$ (Proposition A.3.10). □

Again, the CHSH scenario $B_{2,2,2} = B_{1,2,2} \otimes B_{1,2,2}$ exemplifies that (6.14) is not an equality in general, even after taking the convex hull on the left-hand side.

7. Consistent Exclusivity and Local Orthogonality 

7.1. Introducing Consistent Exclusivity. It is a fundamental property of quantum theory 
that the compatibility of observables is a binary relation: if a collection of quantum observables 
is such that they commute pairwise, then it follows that there is a basis in which all of them are 
diagonal, so that a measurement in that basis can be coarse-grained into a measurement of each 
observable. Paraphrasing Specker [Spe60]$^2$,

A collection of propositions about a quantum mechanical system is precisely then
simultaneously decidable, when they are pairwise simultaneously decidable.

For us, this means the following: suppose that $I \subseteq V(H)$ is a set of vertices in a contextuality scenario $H$ such that every two of them belong to a common edge; by definition of $\mathrm{Ort}(H)$, this means precisely that $I$ is an independent set in $\overline{\mathrm{Ort}}(H)$. Then the associated projections $(P_v)_{v \in I}$ for any quantum model $p \in \mathcal{Q}(H)$ have the property of being pairwise orthogonal, and hence $\sum_{v \in I} P_v \le \mathbb{1}_{\mathcal{H}}$. This implies

$$\sum_{v \in I} p(v) = \sum_{v \in I} \mathrm{tr}(\rho P_v) \le 1.$$

vel vel 

Definition 7.1.1 ([Hen12]). A probabilistic model $p \in \mathcal{G}(H)$ satisfies Consistent Exclusivity if

$$\sum_{v \in I} p(v) \le 1 \tag{7.1}$$

holds for any independent set $I \subseteq V(\overline{\mathrm{Ort}}(H))$. We write $\mathcal{CE}^1(H) \subseteq \mathcal{G}(H)$ for the set of probabilistic models satisfying CE.



2 We thank Adan Cabello for pointing us to [Spe60] and kindly sharing a preliminary version of [Cab12c].

We also write CE$^1$ for this version of CE in order to distinguish it from the upcoming refinement termed CE$^\infty$. We refer to [Cab12c] for an exposition of the history of this principle and of the contexts in which it has been applied.

Intuitively, CE is saying that the total probability of any collection of pairwise exclusive outcomes is at most 1. In this formulation, CE may almost sound like a trivial consequence of the laws of probability; however, this is not the case, since the probabilities $p(v)$ of a probabilistic model are conditional probabilities representing the probability that outcome $v$ occurs given that a measurement $e$ with $v \in e$ has been performed.

Proposition 7.1.2. (a) $\mathcal{Q}(H) \subseteq \mathcal{CE}^1(H)$ for every $H$.
(b) There exist $H$ with $\mathcal{CE}^1(H) \subsetneq \mathcal{G}(H)$.

Proof. (a) Above.

(b) For the triangle scenario $\Delta$ of Figure 3, $V(\Delta)$ is itself an independent set in $\overline{\mathrm{Ort}}(\Delta)$. Since $\sum_{v \in V(\Delta)} p(v) = \frac{3}{2}$ for the unique probabilistic model $p$, this model violates CE. We conclude that $\mathcal{CE}^1(\Delta) = \emptyset$, although $\mathcal{G}(\Delta) = \{p\}$.

See [LSW11] for further discussion of this example and [FSA+12] for examples in multipartite Bell scenarios. □

In [CSW10], Consistent Exclusivity was imposed in the very definition of probabilistic models. The problem with this is that the collection of models satisfying it is not closed under $\otimes$, as we will see in the following. Aside from the unclear physical meaning of CE, this is the main reason why we prefer our Definition 2.3.1: it guarantees that if $p_A$ and $p_B$ are probabilistic models on $H_A$ and $H_B$, respectively, then $p_A \otimes p_B$ is also a probabilistic model on $H_A \otimes H_B$; see Section 3.1.

By the very Definition A.3.1 of the weighted independence number, we have:

Proposition 7.1.3. $p \in \mathcal{CE}^1(H)$ if and only if $\alpha(\overline{\mathrm{Ort}}(H), p) \le 1$.

Again, due to the normalization equations $\sum_{v \in e} p(v) = 1$, the statement $\alpha(\overline{\mathrm{Ort}}(H), p) \le 1$ is actually equivalent to $\alpha(\overline{\mathrm{Ort}}(H), p) = 1$.
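
As a small illustration of Proposition 7.1.3, the following sketch evaluates the weighted independence number by brute force for a toy scenario. It assumes that the triangle scenario consists of three vertices with three two-element edges (Figure 3 is not reproduced here); the function names and the dictionary encoding of $p$ are illustrative choices, not notation from the paper.

```python
from fractions import Fraction
from itertools import combinations

def orthogonal_pairs(edges):
    """Pairs of distinct vertices that share at least one edge."""
    return {frozenset(pair) for e in edges for pair in combinations(e, 2)}

def weighted_independence_number(vertices, edges, p):
    """Largest total weight of a set of pairwise orthogonal vertices.

    Such sets are exactly the independent sets of the non-orthogonality
    graph, so this is a brute-force (exponential time) evaluation.
    """
    orth = orthogonal_pairs(edges)
    best = Fraction(0)
    for size in range(1, len(vertices) + 1):
        for subset in combinations(vertices, size):
            if all(frozenset(pair) in orth for pair in combinations(subset, 2)):
                best = max(best, sum(p[v] for v in subset))
    return best

# Assumed encoding of the triangle scenario: three two-element edges.
triangle_vertices = ["a", "b", "c"]
triangle_edges = [("a", "b"), ("b", "c"), ("a", "c")]
p_triangle = {v: Fraction(1, 2) for v in triangle_vertices}

alpha = weighted_independence_number(triangle_vertices, triangle_edges, p_triangle)
print(alpha)        # 3/2
print(alpha <= 1)   # False, so this model violates CE^1
```

The exhaustive search is only meant for very small scenarios; computing the independence number is NP-hard in general.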

7.2. Local Orthogonality. The concept of Local Orthogonality (LO) was recently introduced in [FSA+12] as an information-theoretic principle satisfied by all quantum correlations in Bell scenarios, but violated by many non-quantum no-signaling boxes. The main reason for considering LO is the search for "physical" principles characterizing quantum correlations. It seems intuitively related to Consistent Exclusivity; here we would like to explain in which sense it is indeed a special case of CE when using our definition (3.7) of Bell scenario.

Recall [FSA+12] that we call two events $u = a_1 \ldots a_n | x_1 \ldots x_n$ and $v = a'_1 \ldots a'_n | x'_1 \ldots x'_n$ in a Bell scenario locally orthogonal if there is a party $i$ with $a_i \neq a'_i$, but $x_i = x'_i$. We now show that two events are locally orthogonal if and only if they are different vertices belonging to a common edge in the hypergraph $B_{n,k,m}$:

Lemma 7.2.1. The events $u, v \in V(B_{n,k,m})$ are locally orthogonal if and only if $u \perp v$.

Proof. Suppose that $u = a_1 \ldots a_n | x_1 \ldots x_n$ and $v = a'_1 \ldots a'_n | x'_1 \ldots x'_n$ are locally orthogonal. By relabeling the parties, we can arrange for $a_1 \neq a'_1$ and $x_1 = x'_1$. Now choose any functions $f_2, \ldots, f_n$ with $f_i(a_1) = x_i$ and $f_i(a'_1) = x'_i$. Then the set of events of the form

$$b_1 \ldots b_n \,|\, x_1 \, f_2(b_1) \ldots f_n(b_1)$$

defines an edge in $B_{n,k,m}$ containing both $u$ and $v$. Intuitively, Alice communicates her outcome to the other parties who then choose their measurement settings as a function of that outcome.

Conversely, $u \perp v$ means that there is an edge $e \in E(B_{n,k,m})$ with $u, v \in e$. More concretely, this states that there is an ordering of the parties $\sigma(1), \ldots, \sigma(n)$ and functions $f_{\sigma(i)}(b_{\sigma(1)}, \ldots, b_{\sigma(i-1)})$ such that $e$ contains exactly those events which have the form

$$b_{\sigma(1)} \ldots b_{\sigma(n)} \,|\, f_{\sigma(1)}() \ldots f_{\sigma(n)}(b_{\sigma(1)}, \ldots, b_{\sigma(n-1)})$$

where we have now written the parties in the order given by the permutation $\sigma$. Since both given events $u = a_1 \ldots a_n | x_1 \ldots x_n$ and $v = a'_1 \ldots a'_n | x'_1 \ldots x'_n$ are assumed to be of this form, we know that $x_{\sigma(i)} = f_{\sigma(i)}(a_{\sigma(1)}, \ldots, a_{\sigma(i-1)})$ and $x'_{\sigma(i)} = f_{\sigma(i)}(a'_{\sigma(1)}, \ldots, a'_{\sigma(i-1)})$. Now let $\sigma(j)$ be the smallest index with $a_{\sigma(j)} \neq a'_{\sigma(j)}$. Then, since $x_{\sigma(j)}$ and $x'_{\sigma(j)}$ only depend on $a_{\sigma(i)}$ and $a'_{\sigma(i)}$ with $i < j$, we conclude that $x_{\sigma(j)} = x'_{\sigma(j)}$, which proves the claim. □
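
The statement of Lemma 7.2.1 can be checked by exhaustive enumeration in the smallest bipartite case. The sketch below assumes, following the description of edges in the proof, that an edge of the bipartite scenario is obtained by fixing the setting of one party and letting the other party's setting be a function of the first party's outcome. The encoding of an event as a pair `(outcomes, settings)` and all function names are hypothetical.

```python
from itertools import combinations, product

def locally_orthogonal(u, v):
    """Some party has different outcomes under the same setting."""
    (a, x), (b, y) = u, v
    return any(ai != bi and xi == yi for ai, bi, xi, yi in zip(a, b, x, y))

def bipartite_bell_edges(k, m):
    """Edges for two parties with k settings and m outcomes each (assumed form)."""
    settings, outcomes = range(k), range(m)
    edges = set()
    for x in settings:
        for f in product(settings, repeat=m):  # f[a] = setting chosen after outcome a
            # First party measures x; second party's setting depends on a.
            edges.add(frozenset(((a, b), (x, f[a])) for a in outcomes for b in outcomes))
            # Second party measures x; first party's setting depends on b.
            edges.add(frozenset(((a, b), (f[b], x)) for a in outcomes for b in outcomes))
    return edges

edges = bipartite_bell_edges(2, 2)
events = sorted(set().union(*edges))

def share_edge(u, v):
    return any(u in e and v in e for e in edges)

lemma_holds = all(locally_orthogonal(u, v) == share_edge(u, v)
                  for u, v in combinations(events, 2))
print(len(events), len(edges), lemma_holds)   # 16 12 True
```

Constant functions `f` produce the same edge from both orderings of the parties, which is why the set contains 12 rather than 16 edges.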

Hence, when working within our framework for contextuality scenarios, the LO$^1$ principle studied in [FSA+12] becomes a special case of CE$^1$ of Definition 7.1.1; the orthogonality between two events naturally arises from the FR product. Those readers not familiar with [FSA+12] may regard this as the definition of LO$^1$. In [Cab12b], this relation between LO$^1$ and CE$^1$ was already implicitly used.

Problem 7.2.2. In [FSA+12], we have introduced LO$^1$, which we now know to define $\mathcal{CE}^1(B_{n,k,m})$, as a principle limiting information processing power. Can this characterization of $\mathcal{CE}^1$ be extended to all contextuality scenarios?

Problem 7.2.3. In [FSA+12], we also showed that LO$^1$ is equivalent to the no-signaling principle in bipartite Bell scenarios, i.e. $\mathcal{CE}^1(B_{2,k,m}) = \mathcal{G}(B_{2,k,m})$. More generally, under which conditions on $H$ does $\mathcal{CE}^1(H) = \mathcal{G}(H)$ hold?

7.3. Consistent Exclusivity and the Shannon capacity of graphs. If $p \in \mathcal{G}(H)$ is a probabilistic model which is realizable in a world obeying certain physical laws, then it is reasonable to assume that any $p^{\otimes n} \in \mathcal{G}(H^{\otimes n})$ is realizable as well, since it simply corresponds to conducting $n$ copies of the same experiment in parallel. If we regard CE as delimiting the set of physically realizable probabilistic models, then this means that if $p^{\otimes n} \notin \mathcal{CE}^1(H^{\otimes n})$, then we already know that $p$ itself is not physically realizable. Therefore we put:

Definition 7.3.1 (CE hierarchy of sets). Let $H$ be a contextuality scenario and $p \in \mathcal{G}(H)$. We write $p \in \mathcal{CE}^n(H)$ if and only if $p^{\otimes n} \in \mathcal{CE}^1(H^{\otimes n})$. Furthermore,

$$\mathcal{CE}^\infty(H) := \bigcap_{n \in \mathbb{N}} \mathcal{CE}^n(H).$$

This is indeed relevant since, as we saw in [FSA+12], for example $\mathcal{CE}^1(B_{2,2,2}) \neq \mathcal{CE}^2(B_{2,2,2})$. See [Cab12b] for another example showing that violations of CE can be "activated" by considering copies $p^{\otimes n}$ of the same model $p$. If $p \in \mathcal{CE}^k(H)$, then we also say that $p$ satisfies CE$^k$. In particular, $p \in \mathcal{CE}^\infty(H)$ if and only if $p \in \mathcal{CE}^n(H)$ for all $n \in \mathbb{N}$, in which case we say that $p$ satisfies CE$^\infty$. In the special case of Bell scenarios, our previous results imply that CE$^\infty$ is precisely LO$^\infty$ of [FSA+12].

We now relate the $\mathcal{CE}^n$ family of sets to the graph-theoretical invariants of Appendix A.

Lemma 7.3.2. (a) $p \in \mathcal{CE}^n(H)$ if and only if $\alpha(\overline{\mathrm{Ort}}(H)^{\boxtimes n}, p^{\otimes n}) \le 1$.

(b) $p \in \mathcal{CE}^\infty(H)$ if and only if $\Theta(\overline{\mathrm{Ort}}(H), p) \le 1$, or, equivalently, if $\alpha(\overline{\mathrm{Ort}}(H), p) = \Theta(\overline{\mathrm{Ort}}(H), p) = 1$.

Proof. (a) By definition, $p \in \mathcal{CE}^n(H)$ if and only if $\alpha(\overline{\mathrm{Ort}}(H^{\otimes n}), p^{\otimes n}) \le 1$. The claim now follows from Lemma 4.2.2.

(b) The first statement holds by the definition of $\Theta$ (A.6). For the second statement, $p \in \mathcal{CE}^\infty(H)$ implies that $\Theta(\overline{\mathrm{Ort}}(H), p) \le 1$. But since $\alpha(\overline{\mathrm{Ort}}(H), p) = 1$ due to $p \in \mathcal{CE}^1(H)$, we find $\Theta(\overline{\mathrm{Ort}}(H), p) = 1 = \alpha(\overline{\mathrm{Ort}}(H), p)$. The converse is clear.

□
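
The quantity in Lemma 7.3.2(a) can grow when passing to strong powers, which is the mechanism behind "activation". The following sketch illustrates this on a purely illustrative weighted graph: the pentagon with uniform weights 1/2, which is the classical example from the theory of the Shannon capacity. The pentagon is used here only as a stand-in for a weighted non-orthogonality graph and is not claimed to be an example from the text; all function names are hypothetical.

```python
from fractions import Fraction
from functools import lru_cache
from itertools import product

def strong_product(adj_g, adj_h):
    """Strong product: distinct pairs are adjacent when every coordinate
    is equal or adjacent."""
    vertices = list(product(adj_g, adj_h))

    def close(adj, a, b):
        return a == b or b in adj[a]

    return {
        (g, h): {(g2, h2) for (g2, h2) in vertices
                 if (g2, h2) != (g, h) and close(adj_g, g, g2) and close(adj_h, h, h2)}
        for (g, h) in vertices
    }

def weighted_independence_number(adj, weight):
    """Exact maximum total weight of an independent set (exponential time)."""
    @lru_cache(maxsize=None)
    def solve(candidates):
        if not candidates:
            return Fraction(0)
        v = min(candidates)
        rest = candidates - {v}
        skip_v = solve(rest)                          # v is not used
        take_v = weight[v] + solve(rest - adj[v])     # v is used, neighbours dropped
        return max(skip_v, take_v)
    return solve(frozenset(adj))

pentagon = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
p = {i: Fraction(1, 2) for i in pentagon}

square = strong_product(pentagon, pentagon)
p2 = {(g, h): p[g] * p[h] for (g, h) in square}

print(weighted_independence_number(pentagon, p))   # 1
print(weighted_independence_number(square, p2))    # 5/4
```

With these weights the first level gives exactly 1, while the second strong power gives 5/4 > 1, so the bound of part (a) fails at level 2 although it holds at level 1.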

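It follows from Corollary 5.2.3 that $\mathcal{Q}(H) \subseteq \mathcal{CE}^\infty(H)$.

Lemma 7.3.3. For every $k, n \in \mathbb{N}$, the following inclusions hold:

$$\mathcal{CE}^\infty(H) \subseteq \ldots \subseteq \mathcal{CE}^n(H) \subseteq \ldots \subseteq \mathcal{CE}^1(H).$$

This should be seen in contrast to Remark A.1.3.

Proof. We choose any $p \in \mathcal{CE}^1(H)$. Thanks to Corollary A.3.11, we know

$$\alpha(\overline{\mathrm{Ort}}(H)^{\boxtimes n}, p^{\otimes n}) \ge \alpha(\overline{\mathrm{Ort}}(H)^{\boxtimes (n-1)}, p^{\otimes (n-1)}) \cdot \alpha(\overline{\mathrm{Ort}}(H), p).$$

Now since $\alpha(\overline{\mathrm{Ort}}(H), p) = 1$, the sequence $(\alpha(\overline{\mathrm{Ort}}(H)^{\boxtimes n}, p^{\otimes n}))_{n \in \mathbb{N}}$ is monotonically nondecreasing. The claim now follows from Lemma 7.3.2. □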

7.4. Does Consistent Exclusivity characterize the quantum set? In [FSA+12], we considered $\mathcal{CE}^\infty(B_{n,k,m})$ for Bell scenarios $B_{n,k,m}$ and asked whether it coincides with $\mathcal{Q}(B_{n,k,m})$. We will answer this question now.

Proposition 7.4.1 (Navascués). For every $H$,

$$\mathcal{Q}_1(H) \subseteq \mathcal{CE}^\infty(H). \tag{7.2}$$

This observation was first made by Miguel Navascués, before this whole formalism had been set up. We are now in a position to give an essentially trivial proof.

Proof. Combine Proposition 6.3.2 and Lemma 7.3.2 with A.3.9. □

In particular, together with $\mathcal{Q}(H) \subseteq \mathcal{Q}_1(H)$, this gives another proof of $\mathcal{Q}(H) \subseteq \mathcal{CE}^\infty(H)$, even if a considerably more convoluted one. This completes our exposition of Figure 1.

Corollary 7.4.2. In the CHSH scenario, the LO principle does not characterize quantum models: $\mathcal{Q}(B_{2,2,2}) \subsetneq \mathcal{CE}^\infty(B_{2,2,2})$.

Proof. From 7.4.1, since $\mathcal{Q}(B_{2,2,2}) \subsetneq \mathcal{Q}_1(B_{2,2,2})$ [NPA08]. □

Theorem 7.4.3. There are contextuality scenarios $H$ for which $\mathcal{Q}_1(H) \subsetneq \mathcal{CE}^\infty(H)$.

Proof. Our Proposition 6.3.2 and Lemma 7.3.2 suggest that this is related to the existence of graphs $G$ for which $\alpha(G) = \Theta(G) < \vartheta(G)$. And indeed, we will turn Haemers' example [Hae81] of this phenomenon into an example of a contextuality scenario $J_n$ with a probabilistic model $p_J \in \mathcal{CE}^\infty(J_n)$ with $p_J \notin \mathcal{Q}_1(J_n)$.

Let $n \ge 12$ be an integer divisible by 4. Let $J_n$ have vertices $V(J_n)$ being all 3-element subsets of $\{1, \ldots, n\}$. An edge of $J_n$ is given in terms of a partition of $\{1, \ldots, n\}$ into 4-element subsets; a vertex (3-element subset) belongs to the edge if and only if it is contained in one of the subsets of the partition.

By construction, all $e \in E(J_n)$ have cardinality $|e| = n$, since every partition consists of $n/4$ subsets and each subset hosts 4 vertices. Therefore, assigning a weight of $\frac{1}{n}$ to each vertex defines a probabilistic model $p_J$. Now the non-orthogonality graph $\overline{\mathrm{Ort}}(J_n)$ consists of the 3-element subsets of $\{1, \ldots, n\}$, two of which are adjacent if and only if they have exactly one element in common. This is the graph that was considered by Haemers [Hae81], who showed that

$$\alpha(\overline{\mathrm{Ort}}(J_n)) = \Theta(\overline{\mathrm{Ort}}(J_n)) = n < \vartheta(\overline{\mathrm{Ort}}(J_n)).$$

Since the probabilistic model $p_J$ has constant weights $\frac{1}{n}$, this means that

$$\alpha(\overline{\mathrm{Ort}}(J_n), p_J) = \Theta(\overline{\mathrm{Ort}}(J_n), p_J) = 1 < \vartheta(\overline{\mathrm{Ort}}(J_n), p_J),$$

and hence $p_J \in \mathcal{CE}^\infty(J_n)$, but $p_J \notin \mathcal{Q}_1(J_n)$. □
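
The purely combinatorial parts of this construction can be checked by enumeration. The sketch below builds the scenario for $n = 12$ (chosen only to keep the enumeration small) and verifies that every edge has $n$ vertices and that two distinct vertices fail to share an edge exactly when they have one element in common. It does not compute $\Theta$ or $\vartheta$; those values are taken from [Hae81] in the text. Function and variable names are illustrative.

```python
from itertools import combinations

def partitions_into_blocks(elements, size=4):
    """Yield every partition of the sorted tuple `elements` into blocks of `size`."""
    if not elements:
        yield []
        return
    first, rest = elements[0], elements[1:]
    for others in combinations(rest, size - 1):
        block = (first,) + others
        remaining = tuple(x for x in rest if x not in others)
        for tail in partitions_into_blocks(remaining, size):
            yield [block] + tail

def build_scenario(n):
    """Vertices: 3-element subsets of {1..n}; one edge per partition into 4-element blocks."""
    ground_set = tuple(range(1, n + 1))
    vertices = list(combinations(ground_set, 3))
    edges = [[v for block in partition for v in combinations(block, 3)]
             for partition in partitions_into_blocks(ground_set)]
    return vertices, edges

n = 12
vertices, edges = build_scenario(n)
orthogonal = {frozenset(pair) for e in edges for pair in combinations(e, 2)}

# Every edge has n vertices, so the uniform weight 1/n is normalized.
edge_sizes_ok = all(len(e) == n for e in edges)
# Non-orthogonal pairs are exactly those sharing one element.
matches_haemers = all(
    (frozenset((u, v)) not in orthogonal) == (len(set(u) & set(v)) == 1)
    for u, v in combinations(vertices, 2)
)
print(len(vertices), len(edges), edge_sizes_ok, matches_haemers)   # 220 5775 True True
```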

7.5. Convexity and activation of Consistent Exclusivity. Since $\mathcal{CE}^1$ is defined in terms of linear inequalities, it is obviously convex. But what about the convexity of $\mathcal{CE}^n$ for $n \ge 2$ and $\mathcal{CE}^\infty$? Also, what about the analog of Proposition 6.3.3 for the $\mathcal{CE}^n$ family of sets? We will now see that these questions are intimately related to some challenging open problems in combinatorics which we introduce in Appendix A. Again, all the following results specialize to results about LO$^\infty$ in the Bell scenario case.

Conjecture 7.5.1. For all contextuality scenarios $H$, $H_A$, $H_B$,

(a) $\mathcal{CE}^\infty(H_A) \otimes \mathcal{CE}^\infty(H_B) \subseteq \mathcal{CE}^1(H_A \otimes H_B)$;

(b) $\mathcal{CE}^\infty(H_A) \otimes \mathcal{CE}^\infty(H_B) \subseteq \mathcal{CE}^\infty(H_A \otimes H_B)$;

(c) $\mathcal{CE}^\infty(H)$ is convex.

Conjecture (b) states that the activation of non-membership in $\mathcal{CE}^\infty$ is impossible: if it fails, then there are $H_A$ and $H_B$ with $p_A \in \mathcal{CE}^\infty(H_A)$ and $p_B \in \mathcal{CE}^\infty(H_B)$, but $p_A \otimes p_B \notin \mathcal{CE}^\infty(H_A \otimes H_B)$, which we interpret as saying that $p_A$ and $p_B$ activate each other. Thanks to Corollary 5.2.3, such an activation would show that $p_A \notin \mathcal{Q}(H_A)$ or $p_B \notin \mathcal{Q}(H_B)$.

Theorem 7.5.2. (a) Each of these conjectures follows from its counterpart in Conjecture A.2.1.

(b) Conjectures (a) and (b) are equivalent and imply (c).

Proof. (a) Combine Proposition 6.3.2 and Lemma 7.3.2 with Propositions A.4.4 and A.4.6.

(b) Conjecture (b) clearly implies (a). For the converse, suppose that we have $p_A \in \mathcal{CE}^\infty(H_A)$ and $p_B \in \mathcal{CE}^\infty(H_B)$ with $p_A \otimes p_B \notin \mathcal{CE}^\infty(H_A \otimes H_B)$. Then there exists some $n \in \mathbb{N}$ with $(p_A \otimes p_B)^{\otimes n} \notin \mathcal{CE}^1(H_A^{\otimes n} \otimes H_B^{\otimes n})$. This would mean that $p_A^{\otimes n} \in \mathcal{CE}^\infty(H_A^{\otimes n})$ and $p_B^{\otimes n} \in \mathcal{CE}^\infty(H_B^{\otimes n})$ form a counterexample to 7.5.1(a).

Concerning the implication from (a) to 7.5.1(c), we consider $p_1, p_2 \in \mathcal{CE}^\infty(H)$ and deduce $p_1^{\otimes k} \otimes p_2^{\otimes (n-k)} \in \mathcal{CE}^1(H^{\otimes n})$ from assumption (a). Due to convexity of $\mathcal{CE}^1(H^{\otimes n})$, this shows that for any $\lambda \in (0,1)$,

$$(\lambda p_1 + (1-\lambda) p_2)^{\otimes n} = \sum_{k=0}^{n} \binom{n}{k} \lambda^k (1-\lambda)^{n-k} \, p_1^{\otimes k} \otimes p_2^{\otimes (n-k)} \in \mathcal{CE}^1(H^{\otimes n}),$$

so that $(\lambda p_1 + (1-\lambda) p_2) \in \mathcal{CE}^n(H)$. Since $n$ was arbitrary, this means $(\lambda p_1 + (1-\lambda) p_2) \in \mathcal{CE}^\infty(H)$, as was to be shown. □

While we suspect each of the Conjectures 7.5.1 to actually be equivalent to its counterpart in A.2.1 (and in A.4.2), we have not been able to prove this yet.
FIGURE 8. A scenario $H_0$ with $\mathcal{G}(H_0) = \mathcal{C}(H_0)$, although $\overline{\mathrm{Ort}}(H_0)$ is not perfect. The two nodes labelled $v$ represent the same vertex.

Failure of (b) or (c) would lead to an obvious way to strengthen the CE principle: the collection of physically realizable probabilistic models should be both convex and closed under $\otimes$. Therefore, if some physically realistic $q \in \mathcal{CE}^\infty(H)$ can be combined with some $p \in \mathcal{CE}^\infty(H')$ by using convex combinations and $\otimes$-products such that the combination is not in $\mathcal{CE}^\infty$, then $p$ itself should be considered to violate the CE principle in a certain extended form.

7.6. Contextuality and perfection. We now study under which conditions $\mathcal{C}(H)$ coincides with $\mathcal{CE}^1(H)$. Recall that a graph $G$ is called perfect if the chromatic number of any induced subgraph is equal to the clique number of this subgraph [Ber61].

Proposition 7.6.1. If $\overline{\mathrm{Ort}}(H)$ is perfect, then $\mathcal{C}(H) = \mathcal{CE}^1(H)$, although $\mathcal{G}(H)$ can still be bigger.

Proof. By the weak perfect graph theorem of Lovász [Lov72], we can as well assume the complement $\mathrm{Ort}(H)$ to be perfect. A probabilistic model $p \in \mathcal{CE}^1(H)$ can be interpreted as vertex weights $p(v)$ for $v \in V(H)$ with $\sum_{v \in C} p(v) \le 1$ for every clique $C$ in $\mathrm{Ort}(H)$. Then, perfection guarantees [Knu94, Thm. 31] that $p$ is a convex combination of indicator functions of independent sets in $\mathrm{Ort}(H)$, i.e. there are cliques $U_1, \ldots, U_k$ in $\overline{\mathrm{Ort}}(H)$ and coefficients $\lambda_i \in [0,1]$ with $\sum_i \lambda_i = 1$ such that

$$p = \sum_{i=1}^{k} \lambda_i \mathbb{1}_{U_i}. \tag{7.3}$$

We now claim that every $\mathbb{1}_{U_i}$ is a deterministic model. Since its weights clearly take values in $\{0,1\}$, it is enough to verify the normalization condition $\sum_{v \in e} \mathbb{1}_{U_i}(v) = 1$ for all $e \in E(H)$. But this follows from (7.3) together with $\sum_{v \in e} p(v) = 1$.

In order to see that $\mathcal{G}(H)$ can still be bigger, consider again the triangle scenario $\Delta$ depicted in Figure 3. There, $\mathcal{C}(\Delta) = \mathcal{CE}^1(\Delta) = \emptyset$, although $\Delta$ allows a probabilistic model. □

The converse to Proposition 7.6.1 is not true:

Proposition 7.6.2. For the scenario $H_0$ depicted in Figure 8, $\mathcal{G}(H_0) = \mathcal{C}(H_0)$. However, $\overline{\mathrm{Ort}}(H_0)$ is not perfect.

Proof. $\overline{\mathrm{Ort}}(H_0)$ is not perfect since its complement $\mathrm{Ort}(H_0)$ contains the pentagon as an induced subgraph in the left part.

On the other hand, every probabilistic model $p$ on $H_0$ is guaranteed to satisfy $p(v) = 1$ due to the structure on the right. Hence, $p(u) = 0$ for all $u$ in the pentagon. Therefore, both $\mathcal{G}(H_0)$ and $\mathcal{C}(H_0)$ can be identified with their counterparts for the right part $H_R$ of Figure 8. Since every maximal independent set in $\overline{\mathrm{Ort}}(H_R)$ is itself an edge, we get $\mathcal{CE}^1(H_R) = \mathcal{G}(H_R)$, and since $\overline{\mathrm{Ort}}(H_R)$ is perfect, we have $\mathcal{C}(H_R) = \mathcal{CE}^1(H_R)$. □

Forcing the vanishing of the weights in the pentagon may seem like a cheap trick. However, we do not know of any natural combinatorial condition which one could impose on a contextuality scenario in order to exclude such pathological behavior of $\mathcal{G}(H)$. In particular, the proof of Shultz's Theorem 2.3.5 uses similar "forcing" ideas [Shu74].

See Proposition 9.3.3 for a slightly less artificial example of a scenario AP 4 with <2i(AP4) = 
< ^? 1 (AP 4 ), although Ort(AP 4 ) is not perfect. 

Theorem 7.6.3 (Strong perfect graph theorem [CRST06]). A graph $G$ is perfect if and only if neither $G$ nor $\overline{G}$ contains an induced subgraph which is a cycle of odd length $\ge 5$.

In combination with Proposition 7.6.1, we obtain:

Corollary 7.6.4. If neither $\mathrm{Ort}(H)$ nor $\overline{\mathrm{Ort}}(H)$ contains an odd cycle of length $\ge 5$ as an induced subgraph, then $\mathcal{C}(H) = \mathcal{CE}^1(H)$.

In this sense, every (quantum) contextuality proof must rely on a "cycle-like" contradiction as it appears in the Klyachko-Can-Binicioglu-Shumovsky scenario (see [KCBS08] and Section 9.2), or on an "anti-cycle-like" contradiction. Within the framework of [CSW10], this observation is due to [CDLTP12], where the anti-cycle case has been studied in a bit more detail.

8. Complexity of various decision problems 

We now study the computational complexity of various decision problems associated to con- 
textuality scenarios. 

8.1. Deciding existence of probabilistic/classical models. The most basic decision prob- 
lem about contextuality scenarios is this: 

Problem: ALLOWS_GENERAL
Input: A contextuality scenario $H$,
Output: $\mathcal{G}(H) \neq \emptyset$?

Recall that there are indeed contextuality scenarios without any probabilistic models, for example Figure 4. Determining the complexity of ALLOWS_GENERAL is very basic as well:

Proposition 8.1.1. ALLOWS_GENERAL is in P.

Proof. Determining whether $\mathcal{G}(H) \neq \emptyset$ is a linear program. □

Now on to the analogous question about classical models:

Problem: ALLOWS_CLASSICAL
Input: A contextuality scenario $H$,
Output: $\mathcal{C}(H) \neq \emptyset$?

Proposition 8.1.2. ALLOWS_CLASSICAL is NP-complete.
Proof. ALLOWS_CLASSICAL can be identified with the class of Boolean satisfiability problems which are conjunctions of clauses, where each clause states that exactly one variable in a certain subset of all variables needs to have the value TRUE. Given this, NP-completeness follows from Schaefer's dichotomy theorem [Sch78]. Notwithstanding, we now offer an explicit proof.

First, ALLOWS_CLASSICAL is clearly in NP: any explicit deterministic model $p : V(H) \to \{0,1\}$ witnesses $\mathcal{C}(H) \neq \emptyset$.
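
The NP membership argument amounts to the fact that a proposed witness can be checked edge by edge. A minimal sketch, with a hypothetical encoding of a scenario as a list of vertices and a list of edges, and with an exhaustive search added only for toy instances:

```python
from itertools import product

def is_deterministic_model(edges, p):
    """p maps vertices to 0 or 1; every edge must contain exactly one vertex
    with value 1. This check runs in time linear in the size of the scenario."""
    return all(sum(p[v] for v in e) == 1 for e in edges)

def find_deterministic_model(vertices, edges):
    """Exhaustive search over all 0/1 assignments (exponential in |V(H)|)."""
    for values in product((0, 1), repeat=len(vertices)):
        p = dict(zip(vertices, values))
        if is_deterministic_model(edges, p):
            return p
    return None

# Two overlapping edges: setting the shared vertex to 1 works.
print(find_deterministic_model(["a", "b", "c"], [("a", "b"), ("b", "c")]))
# {'a': 0, 'b': 1, 'c': 0}

# Three pairwise overlapping two-element edges: no deterministic model exists.
print(find_deterministic_model(["a", "b", "c"], [("a", "b"), ("b", "c"), ("a", "c")]))
# None
```

Only `is_deterministic_model` reflects the polynomial-time verification; the search routine is there to exercise it and has no bearing on the complexity claim.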

Let $x_1, \ldots, x_n$ be Boolean variables and

$$B = (y_{11} \vee y_{12} \vee y_{13}) \wedge \ldots \wedge (y_{m1} \vee y_{m2} \vee y_{m3}) \tag{8.1}$$

be a logical formula in which each $y_{ij}$ stands for some $x_l$ or its negation $\neg x_l$. The Boolean satisfiability problem 3SAT is the following decision problem:

Problem: 3SAT
Input: a logical formula $B$ in the form (8.1),
Output: Is $B$ satisfiable?

This is known to be NP-complete [Kar72]. We now prove NP-hardness of ALLOWS_CLASSICAL by polynomially reducing 3SAT to ALLOWS_CLASSICAL. Denote the clauses in $B$ by

$$C_i = y_{i1} \vee y_{i2} \vee y_{i3}$$

and construct a contextuality scenario $H_B$ as follows. We would like the set of vertices to correspond to the set of literals together with 7 auxiliary variables for each clause in the sense that

$$V(H_B) := \{ v_{x_1}, \ldots, v_{x_n}, v_{\neg x_1}, \ldots, v_{\neg x_n} \} \cup \{ v_{i,s} \},$$

where $i = 1, \ldots, m$ enumerates the clauses and $s \in \{001, 010, 011, 100, 101, 110, 111\}$ runs over the feasible truth value assignments to the literals in a clause. There are three kinds of edges,

$$E(H_B) := \{ \{ v_{x_j}, v_{\neg x_j} \} : j = 1, \ldots, n \} \cup \{ \{ v_{i,001}, \ldots, v_{i,111} \} : i = 1, \ldots, m \}$$

$$\cup \; \{ \{ v_{i,s}, v_{(\neg) y_{i1}}, v_{(\neg) y_{i2}}, v_{(\neg) y_{i3}} \} : i = 1, \ldots, m; \; s = 001, \ldots, 111 \}$$

where in the third type of edge, the negation $\neg$ appears if and only if $s$ has a 1 at the corresponding position.

The first type of edge guarantees that in any deterministic model, either $v_{x_j}$ or $v_{\neg x_j}$ gets the value 1, but not both; the second kind of edge guarantees that for every $i$, exactly one of the $v_{i,s}$'s is 1; the third type ensures that if $p(v_{i,s}) = 1$, then the variables of $C_i$ have precisely the values given by $s$. Therefore, the deterministic models on $H_B$ correspond bijectively to the satisfying variable assignments of $B$. □

8.2. A semidefinite hierarchy converging to C(H). For some combinatorial optimization 
problems, one can find a contextuality scenario H, with polynomially many vertices and edges, 
such that the associated C(H) coincides with the usual polytope associated to the combinatorial 
optimization problem [Sch03]. We have illustrated how to do this for the case of 3SAT above, 
but similar reductions can be made also e.g. for coloring problems on graphs. The main idea is 
that the vertices of H are interpreted as boolean variables, and any formula of propositional logic 
can be encoded in terms of a collection of edges, possibly using some auxiliary variables. Then, 
our machinery automatically produces an associated linear as well as a semidefinite relaxation of $\mathcal{C}(H)$, namely $\mathcal{G}(H)$ and $\mathcal{Q}_1(H)$, respectively. In fact, using a variant of Definition 6.1.2, where one additionally imposes that $M_{\mathbf{v},\mathbf{w}} = M_{\pi(\mathbf{v}),\mathbf{w}}$ for any permutation $\pi$, gives a hierarchy of semidefinite relaxations $(\mathcal{C}_n(H))_{n \in \mathbb{N}}$ converging to $\mathcal{C}(H)$ [Las02]. Due to the high number of constraints, this converges even after a finite number of steps: $\mathcal{C}_{|V(H)|}(H) = \mathcal{C}(H)$ since any matrix element $M_{\mathbf{v},\mathbf{w}}$ with $\mathbf{v}$ or $\mathbf{w}$ longer than $|V(H)|$ is already determined by the other ones. However, while every $\mathcal{C}_n$ for fixed $n$ is defined by a semidefinite program of polynomial size, the semidefinite program defining $\mathcal{C}_{|V|}$ is of exponential size. We have not implemented any of this since (hierarchies of) smaller specialized semidefinite relaxations exist in the literature [Anj04, Lau03].

8.3. Towards an inverse sandwich theorem? Now that we know the complexity of ALLOWS_GENERAL and ALLOWS_CLASSICAL, we move on to consider the quantum case, which may have some surprises to offer.

Problem: ALLOWS_QUANTUM
Input: A contextuality scenario $H$,
Output: $\mathcal{Q}(H) \neq \emptyset$?

This is equivalent to asking whether there exists an assignment of projections $P_v \in \mathcal{B}(\mathcal{H})$ to each $v \in V(H)$ such that $\sum_{v \in e} P_v = \mathbb{1}$ for all $e \in E(H)$. Here, the Hilbert space $\mathcal{H}$ can be taken to be separable infinite-dimensional without loss of generality, i.e. $\mathcal{H} = \ell^2(\mathbb{N})$.

By definition, every set $\mathcal{Q}_n(H)$ is given by a semidefinite program of polynomial size, and therefore determining whether $\mathcal{Q}_n(H) = \emptyset$ can be done efficiently. One might suspect that this should give an algorithm for ALLOWS_QUANTUM thanks to the following observation:

Lemma 8.3.1. $\mathcal{Q}(H) = \emptyset$ if and only if $\mathcal{Q}_n(H) = \emptyset$ for some $n \in \mathbb{N}$.

Proof. If $\mathcal{Q}_n(H) = \emptyset$ for some $n$, then clearly $\mathcal{Q}(H) = \emptyset$ as well. To show the converse, assume $\mathcal{Q}(H) = \emptyset$, so that $\bigcap_n \mathcal{Q}_n(H) = \emptyset$. Since this is an intersection of closed subsets of the compact space $\mathcal{Q}_1(H)$, we conclude that already finitely many of the $\mathcal{Q}_n(H)$ have empty intersection. Because the $\mathcal{Q}_n(H)$ form a decreasing sequence of sets, there has to be some $n \in \mathbb{N}$ with $\mathcal{Q}_n(H) = \emptyset$. □

However, simply checking whether $\mathcal{Q}_n(H) = \emptyset$ for some $n$ is a procedure that never terminates if $\mathcal{Q}(H) \neq \emptyset$. Hence, in order to find an algorithm for ALLOWS_QUANTUM, we also need a procedure for detecting that $\mathcal{Q}(H) \neq \emptyset$ if this is the case.

One way to do this is to look in every Hilbert space dimension $\mathcal{H} = \mathbb{C}^d$ and see if there exists a quantum model in this dimension. For fixed $d$, this boils down to determining whether a certain system of polynomial equations and inequalities has a solution in $\mathbb{R}$. Thanks to real quantifier elimination [Tar51], this is known to be decidable. Therefore, if a quantum model over some finite-dimensional Hilbert space exists, this procedure will eventually find it.

By combining these two procedures, we have an algorithm for deciding ALLOWS_QUANTUM that works in all cases except in the case that $H$ allows quantum models, but only on infinite-dimensional Hilbert spaces.
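
The combination of the two procedures is an interleaving of two semi-decision procedures. The following sketch shows only this control flow. The two callables are hypothetical oracles standing in for the semidefinite feasibility test at level $n$ and for the quantifier-elimination test in dimension $d$; neither is implemented here, and `max_rounds` is an added safeguard that is not part of the procedure described in the text.

```python
from itertools import count

def decide_allows_quantum(level_is_empty, has_model_in_dimension, max_rounds=None):
    """Interleave two semi-decision procedures.

    level_is_empty(n): hypothetical oracle, True if the n-th level set is empty.
    has_model_in_dimension(d): hypothetical oracle, True if a quantum model
        exists on a d-dimensional Hilbert space.
    Returns False if some level is empty, True if a finite-dimensional model
    is found, and None if max_rounds is exhausted first.
    """
    for n in count(1):
        if max_rounds is not None and n > max_rounds:
            return None
        if level_is_empty(n):
            return False
        if has_model_in_dimension(n):
            return True

# Toy oracles: a refutation appears at level 3, no model in any dimension.
print(decide_allows_quantum(lambda n: n >= 3, lambda d: False))                 # False
# Toy oracles: no level is ever empty, a model exists from dimension 4 on.
print(decide_allows_quantum(lambda n: False, lambda d: d >= 4))                 # True
# Neither procedure ever succeeds: without a bound this would loop forever.
print(decide_allows_quantum(lambda n: False, lambda d: False, max_rounds=10))   # None
```

The third call corresponds to the problematic case discussed above, in which quantum models exist only on infinite-dimensional Hilbert spaces and neither procedure terminates.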

Problem 8.3.2. Does such an H exist? 

In terms of [FNT12], we can also phrase this as follows. We construct the universal unital $C^*$-algebra associated to a contextuality scenario $H$,

$$C^*(H) := \Big\langle \{ P_v : v \in V(H) \} \;\Big|\; P_v = P_v^2 = P_v^* \;\; \forall v \in V(H), \quad \sum_{v \in e} P_v = \mathbb{1} \;\; \forall e \in E(H) \Big\rangle.$$

If all of these $C^*$-algebras are residually finite-dimensional, then Problem 8.3.2 has a negative answer and the above algorithm solves ALLOWS_QUANTUM, even if with very high complexity.

However, since Kirchberg's QWEP conjecture and Connes' embedding problem are equivalent to the residual finite-dimensionality of e.g. $C^*(B_{2,3,2})$ [Fri12, FNT12], we suspect that it is too much to hope for that all $C^*(H)$ are residually finite-dimensional.

A different approach to ALLOWS_QUANTUM lies in recognizing that any instance of it can be reformulated as an $\exists_1$ formula in quantum logic with signature $(\vee, \perp, 1)$ on an infinite-dimensional Hilbert space. However, since the decidability status of quantum logic is also not known [Svo93, p. 69], this approach does not produce a terminating algorithm either.

In conclusion, we do not know of any terminating algorithm that would solve ALLOWS_QUANTUM. 
In fact, we suspect the following: 

Conjecture 8.3.3. ALLOWS_QUANTUM is undecidable.

Recall that Lovász's sandwich theorem [Knu94] consists of the inequality

$$\alpha(G) \le \vartheta(G) \le \chi(\overline{G}) \tag{8.2}$$

together with the observation that the outer two quantities are NP-hard to compute, while $\vartheta(G)$ can be computed in polynomial time to arbitrary precision. We call 8.3.3 the inverse sandwich conjecture since the hypothetically uncomputable $\mathcal{Q}(H)$ lies between two computable sets,

$$\mathcal{C}(H) \subseteq \mathcal{Q}(H) \subseteq \mathcal{G}(H).$$

So in contrast to the case of (8.2), here the real meat indeed lies in the middle of the sandwich.

For the reasons discussed above, a proof of Conjecture 8.3.3 would have some interesting consequences for $C^*$-algebra theory and also prove the undecidability of quantum logic$^3$. Since these are difficult problems in themselves, proving Conjecture 8.3.3 will also be difficult.

8.4. Other decision problems. There is a myriad of decision problems associated to con- 
textuality scenarios which one can study. Some further ones are: 

Problem: IS_CLASSICAL

Input: A contextuality scenario $H$ and $p \in \mathcal{G}(H)$ with $p(v) \in \mathbb{Q}$.
Output: $p \in \mathcal{C}(H)$?

It is not difficult to see that this is in NP. Furthermore, it is actually NP-complete, since this 
is the case already for Bell scenarios [AII06]. 
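
The membership in NP can be made concrete as follows. By Carathéodory's theorem, a classical model is a convex combination of at most $|V(H)| + 1$ deterministic models, so such a combination can serve as a certificate. The sketch below only illustrates the checking step, which is presumably what the remark above has in mind; the function names and the toy scenario are hypothetical and not taken from the paper.

```python
from fractions import Fraction


def is_deterministic_model(edges, support):
    """A deterministic model is given by its support: exactly one vertex per edge."""
    return all(len(support & edge) == 1 for edge in edges)


def verify_classical_certificate(vertices, edges, p, certificate):
    """Check that p equals the convex combination described by `certificate`.

    `certificate` is a list of (weight, support) pairs, where each support is
    the set of vertices to which a deterministic model assigns probability 1.
    """
    weights = [w for w, _ in certificate]
    if any(w < 0 for w in weights) or sum(weights) != 1:
        return False
    if not all(is_deterministic_model(edges, s) for _, s in certificate):
        return False
    return all(
        sum(w for w, s in certificate if v in s) == p[v] for v in vertices
    )


# Toy scenario (hypothetical): two edges {a, b} and {b, c} sharing the vertex b.
vertices = ["a", "b", "c"]
edges = [frozenset("ab"), frozenset("bc")]
p = {"a": Fraction(1, 3), "b": Fraction(2, 3), "c": Fraction(1, 3)}
certificate = [(Fraction(2, 3), frozenset("b")), (Fraction(1, 3), frozenset("ac"))]
print(verify_classical_certificate(vertices, edges, p, certificate))
```

Exact rational arithmetic is used because the input probabilities are assumed to be rational; the check runs in time polynomial in the size of the scenario and of the certificate.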

Similarly, one can consider decision problems like IS_QUANTUM and IS_LO. So far, we have not
considered these any further. Another natural decision problem is the question whether a given
scenario allows nonclassical models or not:

Problem: NONCONTEXTUAL
Input: A contextuality scenario $H$.
Output: $\mathcal{C}(H) = \mathcal{G}(H)$?

We also do not know what the complexity of this problem is. We suspect that Theorem 2.4.3 
together with the techniques of [Eit94] might be helpful for answering this question. 

9. Examples 

In the previous sections, we have developed the abstract theory of contextuality scenarios in
some detail. We have exemplified some of the concepts and results for the case of Bell scenarios.
In particular, this illustrates how our formalism makes precise the intuition that nonlocality is a
special case of contextuality. Also, Appendix B provides a large class of examples of contextuality
scenarios.

³ More precisely, it would imply that the theory of Hilbert lattices in the signature $(\vee, \perp, 1)$ is not decidable.

Now it is time to look into other more concrete cases. The examples that have already been 
considered in the quantum foundations literature are too numerous to list. We focus on a few 
particularly appealing classes. 

9.1. Hypergraphs with subnormalization. Cabello, Severini and Winter [CSW10] also base
their approach on hypergraphs $H$, in a spirit very similar to ours. The main difference
is that they do not impose a normalization constraint

\[
\sum_{v \in e} p(v) = 1 \quad \forall e \in E(H),
\]

but rather a subnormalization constraint

\[
\sum_{v \in e} p(v) \le 1 \quad \forall e \in E(H),
\]

and similarly $\sum_{v \in e} P_v \le \mathbb{1}_{\mathcal{H}}$ for projections in quantum representations. Our approach can
"simulate" this behavior by constructing a contextuality scenario $H'$ which contains one additional
no-detection event $w_e$ for each $e \in E(H)$,

\[
V(H') := V(H) \cup \{ w_e : e \in E(H) \}, \qquad E(H') := \{ e \cup \{w_e\} : e \in E(H) \}.
\]

Essentially by definition, the "classical models" of [CSW10] correspond to our $\mathcal{C}(H')$, the
"generalized models" to our $\mathcal{CE}^1(H')$, and the "quantum models" to our $\mathcal{Q}_1(H')$.
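
The passage from $H$ to $H'$ is purely combinatorial and can be sketched in a few lines. The following is a minimal illustration, assuming hypergraphs are stored as a vertex list plus a list of edges; the labels `("w", k)` for the no-detection events and both function names are hypothetical choices made here, and the labels are assumed not to clash with existing vertices.

```python
from fractions import Fraction


def add_no_detection_events(vertices, edges):
    """Return (V(H'), E(H')): one fresh vertex ("w", k) is added to the k-th edge."""
    new_vertices = list(vertices)
    new_edges = []
    for k, edge in enumerate(edges):
        w = ("w", k)  # hypothetical label for the no-detection event of edge k
        new_vertices.append(w)
        new_edges.append(frozenset(edge) | {w})
    return new_vertices, new_edges


def extend_subnormalized(edges, p):
    """Extend a subnormalized p on H to H' via p(w_e) = 1 - sum of p(v) over v in e."""
    extended = dict(p)
    for k, edge in enumerate(edges):
        extended[("w", k)] = 1 - sum(p[v] for v in edge)
    return extended


# Example: the 5-cycle, viewed as a hypergraph whose edges are pairs of neighbours.
vertices = list(range(5))
edges = [frozenset({i, (i + 1) % 5}) for i in range(5)]
new_vertices, new_edges = add_no_detection_events(vertices, edges)
p = extend_subnormalized(edges, {v: Fraction(2, 5) for v in vertices})
print(all(sum(p[v] for v in e) == 1 for e in new_edges))
```

Applied to the 5-cycle, this produces a scenario with 10 vertices and 5 edges of size 3, which has the same shape as the pentagon scenario discussed in the next subsection.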

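Lemma 9.1.1. In this situation, $\mathcal{Q}(H') = \mathcal{Q}_1(H')$.

Proof. Starting from $p \in \mathcal{Q}_1(H')$, we would like to show that $p \in \mathcal{Q}(H')$. We use 6.3.1(d) as
a criterion for membership in $\mathcal{Q}_1(H')$. This means that we have projections $P_v$ for all $v \in V(H)$
and $P_{w_e}$ for all $e \in E(H)$ such that

\[
u \perp v \;\Rightarrow\; P_u \perp P_v, \qquad v \in e \;\Rightarrow\; P_v \perp P_{w_e},
\]

and $p(v) = \langle\Psi|P_v|\Psi\rangle$ as well as $p(w_e) = \langle\Psi|P_{w_e}|\Psi\rangle$. We now define

\[
P'_{w_e} := \mathbb{1}_{\mathcal{H}} - \sum_{v \in e} P_v
\]

and claim that these, together with the $P_v$ and the state $|\Psi\rangle$, form a quantum model of $p$. First,
due to $\sum_{v \in e} P_v \le \mathbb{1}_{\mathcal{H}}$, the operator $P'_{w_e}$ is also a projection. Second, the completeness relation for
edges in $E(H')$ then holds by definition. Third,

\[
\langle\Psi|P'_{w_e}|\Psi\rangle = \langle\Psi|\Psi\rangle - \sum_{v \in e} \langle\Psi|P_v|\Psi\rangle = 1 - \sum_{v \in e} p(v) = p(w_e),
\]

as claimed. Hence, $p \in \mathcal{Q}(H')$. □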

In this sense, the set of quantum models of a scenario which arises in this way is particularly 
simple: the whole semidefinite hierarchy collapses to the first level! The advantage of this is that 
Proposition 6.3.2 on the relation between $\mathcal{Q}_1$ and the Lovász number now also applies to quantum
models. So, scenarios constructed in this way form a very special and well-behaved subclass of 
all contextuality scenarios. The n-circular hypergraphs that we consider next arise in this way. 
However, many of the more interesting contextuality scenarios — like Bell scenarios — are not of this 
form and therefore cannot be treated correctly in the CSW approach. This was already noticed 
in [CSW10].

Figure 9. The 3-circular hypergraph $\Delta_3$. The labeling of the vertices corresponds
to [FGR92, Ex. 2.13].

9.2. n-circular hypergraphs. The n-circular hypergraphs generalize the "pentagon" idea of
Klyachko-Can-Binicioğlu-Shumovsky [KCBS08].

Definition 9.2.1. For $n \ge 3$, the n-circular hypergraph $\Delta_n$ is given by

\[
V(\Delta_n) := \{v_1, \ldots, v_n, w_1, \ldots, w_n\},
\]
\[
E(\Delta_n) := \{\{v_1, w_1, v_2\}, \ldots, \{v_n, w_n, v_1\}\}.
\]

So, $\Delta_n$ has $2n$ vertices and $n$ edges as follows: if all vertices are evenly distributed on a circle
in the order $v_1, w_1, \ldots, v_n, w_n, v_1$, then every second triple of adjacent vertices, namely those of the
form $\{v_i, w_i, v_{i+1}\}$, is an edge (we write $v_{n+1} = v_1$). The $w_i$ can be interpreted as no-detection
events as explained in the previous subsection. In particular, Lemma 9.1.1 applies, and we see that
$\mathcal{Q}(\Delta_n) = \mathcal{Q}_1(\Delta_n)$.

Figure 9 displays $\Delta_3$, which can be metaphorically illustrated as a firefly box [Wil08]. It
corresponds to the Wright triangle of [FGR92, Ex. 2.13] under the relabeling

\[
v_1 \mapsto a, \quad w_1 \mapsto b, \quad v_2 \mapsto c, \quad w_2 \mapsto d, \quad v_3 \mapsto e, \quad w_3 \mapsto f.
\]

$\Delta_5$ is the "pentagon" scenario on which the KCBS inequality [KCBS08] is defined. It was first
considered by Wright in 1978 [Wri78]. We now extend some of these results to arbitrary $n$.

Proposition 9.2.2. Let $n \ge 3$.

(a) $\dim(\mathcal{C}(\Delta_n)) = \dim(\mathcal{G}(\Delta_n)) = n$.

(b) If $n$ is even, then $\mathcal{C}(\Delta_n) = \mathcal{G}(\Delta_n)$.

(c) If $n$ is odd, then $\mathcal{C}(\Delta_n) \subsetneq \mathcal{G}(\Delta_n)$ is determined by the inequality

\[
\sum_i p(v_i) \le \frac{n-1}{2}. \tag{9.1}
\]

There is one extreme point of $\mathcal{G}(\Delta_n)$ which violates this inequality. It is the probabilistic
model $p_x \in \mathcal{G}(\Delta_n)$ with

\[
p_x(v_i) = \tfrac{1}{2} \ \ \forall i, \qquad p_x(w_i) = 0 \ \ \forall i. \tag{9.2}
\]

In particular, $\mathcal{G}(\Delta_n)$ has one vertex more than $\mathcal{C}(\Delta_n)$.

Proof. We consider all vertex indices modulo $n$, so that $v_{n+1} = v_1$ etc.

(a) The equations imposed on the probabilities $p(v_i)$ and $p(w_i)$ by the normalization
constraints are just

\[
p(w_i) = 1 - p(v_i) - p(v_{i+1}), \tag{9.3}
\]

which implies $\dim(\mathcal{G}(\Delta_n)) \le n$. The conclusion follows if we can produce $n + 1$ linearly
independent deterministic models. This is simple: the set of models

\[
p_j(v_i) := \begin{cases} 1 & \text{if } i = j, \\ 0 & \text{otherwise,} \end{cases}
\]

where the $p_j(w_i)$ are uniquely determined thanks to (9.3) and $j \in \{1, \ldots, n\}$, is linearly
independent. Furthermore, adding to this set the model $p_0$ with $p_0(v_i) = 0$ for all $i$
preserves linear independence. This is the desired collection of $n + 1$ linearly independent
deterministic models.

(b) $\mathcal{C}(\Delta_n) = \mathcal{CE}^1(\Delta_n)$ follows from Corollary 7.6.4, and $\mathcal{CE}^1(\Delta_n) = \mathcal{G}(\Delta_n)$ because the
maximal independent sets of $\mathrm{Ort}(\Delta_n)$ are precisely the edges of $\Delta_n$. In particular, while (9.2)
is also a probabilistic model for even $n$, in this case it has to be a convex combination of
deterministic models.

Note that this argument has not used (a).

(c) We apply Theorem 2.4.3 in combination with Corollary 7.6.4. Any induced subscenario
$H_W$ with $\mathcal{C}(H_W) \ne \mathcal{G}(H_W)$ needs to contain an induced (anti-)cycle in $\mathrm{Ort}(H_W)$. This
is possible only if $W$ contains all $v_i$. If $W$ also contains one or more of the $w_i$'s, then
$H_W$ does not have a unique probabilistic model. Therefore, there can be at most one
nonclassical extreme point of $\mathcal{G}(\Delta_n)$, namely the one associated to the induced subscenario
on $W = \{v_1, \ldots, v_n\}$. Now this $H_W$ does indeed have a unique probabilistic model given
by $p_x(v_i) = \tfrac{1}{2}$, which yields (9.2) upon extension to $\Delta_n$. This proves that $\mathcal{G}(\Delta_n)$ has $p_x$
as its sole nonclassical extreme point without ever using any inequalities.

We now give an independent proof showing that (9.1) defines $\mathcal{C}(\Delta_n)$. Thanks to (9.3),
it is enough to consider the values $p(v_i)$ only. Now the deterministic models correspond
to the independent sets in the cycle graph $C_n$; upon identifying each vertex with the edge
adjacent on its left, an independent set in $C_n$ gets identified with a set of edges in $C_n$
no two of which are adjacent at the same vertex, i.e. with a matching on $C_n$. Now it is
known [Sch03] that the polytope of all matchings on $C_n$ corresponds to

\[
p(v_i) \ge 0, \qquad p(v_i) + p(v_{i+1}) \le 1, \qquad \sum_{i=1}^{n} p(v_i) \le \frac{n-1}{2}.
\]

This is precisely the description of $\mathcal{C}(\Delta_n)$ that was to be proven.

□



For $n = 5$, the set of classical models is bounded by the inequality $\sum_{i=1}^{5} p(v_i) \le 2$, which
is precisely the inequality which has been studied in [KCBS08]. Compare [AQB+12] for the
characterization of classical models in a related scenario.
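
For small $n$, the classical bound can be checked by brute force. The sketch below enumerates all deterministic models of the n-circular hypergraph, taken here to be the 0/1 assignments with exactly one 1 per edge, and maximizes $\sum_i p(v_i)$ over them; since a linear functional on a polytope is maximized at an extreme point, this is also the maximum over all classical models. The vertex labels `("v", i)` and `("w", i)`, with indices starting at 0, and the function names are illustrative choices, not notation from the paper.

```python
from fractions import Fraction
from itertools import product


def circular_hypergraph(n):
    """Vertices and edges {v_i, w_i, v_{i+1}} of the n-circular hypergraph (indices mod n)."""
    vertices = [("v", i) for i in range(n)] + [("w", i) for i in range(n)]
    edges = [frozenset({("v", i), ("w", i), ("v", (i + 1) % n)}) for i in range(n)]
    return vertices, edges


def deterministic_models(vertices, edges):
    """Yield all 0/1 assignments with exactly one 1 in every edge."""
    for bits in product((0, 1), repeat=len(vertices)):
        p = dict(zip(vertices, bits))
        if all(sum(p[v] for v in e) == 1 for e in edges):
            yield p


def max_classical_sum(n):
    """Maximum of sum_i p(v_i) over deterministic (hence over classical) models."""
    vertices, edges = circular_hypergraph(n)
    return max(
        sum(p[("v", i)] for i in range(n))
        for p in deterministic_models(vertices, edges)
    )


def half_model(n):
    """The model with p(v_i) = 1/2 and p(w_i) = 0 for all i."""
    p = {("v", i): Fraction(1, 2) for i in range(n)}
    p.update({("w", i): Fraction(0) for i in range(n)})
    return p


n = 5
_, edges = circular_hypergraph(n)
p_half = half_model(n)
print(max_classical_sum(n), sum(p_half[("v", i)] for i in range(n)))
```

For $n = 5$ the brute-force maximum is 2, while the half-valued model is normalized on every edge and reaches $5/2$. The enumeration is exponential in $n$ and is only meant for small cases.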

Proposition 9.2.3. $\mathcal{C}(\Delta_3) = \mathcal{CE}^1(\Delta_3) \subsetneq \mathcal{G}(\Delta_3)$. For all other $n$, $\mathcal{CE}^1(\Delta_n) = \mathcal{G}(\Delta_n)$.

Proof. Since $\{v_1, v_2, v_3\}$ is the only maximal independent set in $\mathrm{Ort}(\Delta_3)$ which is not an edge of $\Delta_3$, we
find that $\mathcal{CE}^1(\Delta_3)$ as a subset of $\mathcal{G}(\Delta_3)$ is given by imposing the inequality $p(v_1) + p(v_2) + p(v_3) \le 1$.
This is precisely the inequality that determines $\mathcal{C}(\Delta_3)$ in 9.2.2(c). For $n \ge 4$, however, every
maximal independent set in $\mathrm{Ort}(\Delta_n)$ is of the form $\{v_i, w_i, v_{i+1}\}$, i.e. is itself an edge. □

9.3. Antiprism scenarios. The antiprism scenarios are a variant of the circular hypergraph
scenarios with some additional edges thrown in such that there is a symmetry exchanging the $v_i$
with the $w_i$. Again, we consider all vertex indices modulo $n$. The antiprism scenarios are supposed
to illustrate that an interesting-looking hypergraph is not necessarily an interesting contextuality
scenario.

Definition 9.3.1. Let $n \ge 3$. The n-antiprism scenario $AP_n$ is

\[
V(AP_n) := \{v_1, \ldots, v_n, w_1, \ldots, w_n\},
\]
\[
E(AP_n) := \{\{v_1, w_1, v_2\}, \ldots, \{v_n, w_n, v_1\}\} \cup \{\{w_1, v_2, w_2\}, \ldots, \{w_n, v_1, w_1\}\}.
\]

The idea behind the term "antiprism" is that one gets $AP_n$ by considering the antiprism
polytope over an $n$-gon and defining the hypergraph $AP_n$ by the band of triangles winding
around the polytope.

Proposition 9.3.2. If $n$ is divisible by 3, then $\mathcal{C}(AP_n) = \mathcal{G}(AP_n)$ is a 2-dimensional triangle.
Otherwise, $AP_n$ has a unique probabilistic model which is not classical.

Proof. We show that $p(v_1)$ and $p(v_2)$ determine all other probabilities $p(v_i)$ and $p(w_i)$ by
induction on $i$:

\[
p(v_{i+1}) = 1 - p(v_i) - p(w_i), \qquad p(w_{i+1}) = 1 - p(w_i) - p(v_{i+1}).
\]

In fact, this shows that

\[
p(v_{3j+1}) = p(w_{3j+2}) = p(v_1), \qquad p(v_{3j+2}) = p(w_{3j}) = p(v_2), \qquad p(v_{3j}) = p(w_{3j+1}) = 1 - p(v_1) - p(v_2).
\]

Now if $n$ is divisible by 3, then this is consistent upon "going around the cycle", so that $\mathcal{G}(AP_n)$
can be identified with the triangle

\[
p(v_1) \ge 0, \qquad p(v_2) \ge 0, \qquad p(v_1) + p(v_2) \le 1.
\]

Clearly, the extreme points of this triangle are deterministic.

If $n$ is not divisible by 3, then the above recurrence relations imply that $p(v_1) = p(v_2) = \tfrac{1}{3}$, so
that $\mathcal{G}(AP_n)$ degenerates to a single point. $\mathcal{C}(AP_n) = \emptyset$ since there is no deterministic model. □
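
The dichotomy in this proof is easy to observe computationally for small $n$. The following sketch, with illustrative function names and 0-based vertex labels that are not taken from the paper, counts deterministic models of the n-antiprism scenario by brute force and propagates the two recurrences once around the cycle to test whether given starting values close up consistently.

```python
from fractions import Fraction
from itertools import product


def antiprism_edges(n):
    """Edges {v_i, w_i, v_{i+1}} and {w_i, v_{i+1}, w_{i+1}}, indices taken mod n."""
    edges = []
    for i in range(n):
        j = (i + 1) % n
        edges.append(frozenset({("v", i), ("w", i), ("v", j)}))
        edges.append(frozenset({("w", i), ("v", j), ("w", j)}))
    return edges


def count_deterministic_models(n):
    """Number of 0/1 assignments with exactly one 1 in every edge."""
    vertices = [(kind, i) for kind in "vw" for i in range(n)]
    edges = antiprism_edges(n)
    count = 0
    for bits in product((0, 1), repeat=len(vertices)):
        p = dict(zip(vertices, bits))
        if all(sum(p[x] for x in e) == 1 for e in edges):
            count += 1
    return count


def closes_up(n, a, b):
    """Start from p(v_1) = a, p(v_2) = b and apply the recurrences n times."""
    v, w = a, 1 - a - b        # p(v_1) and p(w_1)
    for _ in range(n):
        v = 1 - v - w          # p(v_{i+1})
        w = 1 - w - v          # p(w_{i+1}), using the new p(v_{i+1})
    return (v, w) == (a, 1 - a - b)


third = Fraction(1, 3)
print(count_deterministic_models(3), count_deterministic_models(4))
print(closes_up(4, third, third), closes_up(4, Fraction(1), Fraction(0)))
```

For $n = 3$ and $n = 6$ exactly three deterministic models appear, matching the three corners of the triangle, while for $n = 4$ and $n = 5$ there are none and only the uniform value $1/3$ closes up.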

We now give another example application of our methods.

Proposition 9.3.3. $\mathcal{Q}_1(AP_4) = \emptyset$, although $\mathcal{CE}^1(AP_4) = \mathcal{G}(AP_4)$.

Proof. Direct inspection shows that every maximal independent set in $\mathrm{Ort}(AP_4)$ is an edge,
so that the unique probabilistic model given by $p(v_i) = p(w_i) = \tfrac{1}{3}$ is in $\mathcal{CE}^1(AP_4)$.

It remains to show that the unique probabilistic model is not in $\mathcal{Q}_1(AP_4)$. By Proposition 6.3.2,
this boils down to showing that $\tfrac{1}{3}\vartheta(\mathrm{Ort}(AP_4)) > 1$. Now $\mathrm{Ort}(AP_4)$ is the complement
of the 4-antiprism graph $\mathrm{P1}_4$. Since $\mathrm{P1}_4$ is vertex-symmetric, we deduce [Knu94, Thm. 25] that
$\vartheta(\mathrm{P1}_4)\,\vartheta(\mathrm{Ort}(AP_4)) = 8$. Now $\vartheta(\mathrm{P1}_4)$ is known [BPT11] to equal $8 - 4\sqrt{2}$, so that

\[
\tfrac{1}{3}\vartheta(\mathrm{Ort}(AP_4)) = \frac{8}{3\,(8 - 4\sqrt{2})} = \frac{2 + \sqrt{2}}{3} > 1,
\]

as was to be shown. □

Also, note that the antiprism graph $\mathrm{P1}_4$ which appears in this proof has also arisen as the
non-orthogonality graph of possible events for the PR-box [Cab12b, FSA+12].

9.4. Matching scenarios. Let $K_m$ be the complete graph on $m$ vertices. We define a
contextuality scenario $\mathrm{Mat}_m$ as follows. $V(\mathrm{Mat}_m)$ is defined to be the set of edges of $K_m$, so that
$|V(\mathrm{Mat}_m)| = \frac{m(m-1)}{2}$. The set of edges of $\mathrm{Mat}_m$ itself is $E(\mathrm{Mat}_m) = \{e_1, \ldots, e_m\}$, where $e_j$ is the
set of all edges in $K_m$ incident to the vertex $j \in K_m$. In the language of hypergraph theory [Vol09],
$\mathrm{Mat}_m$ is the dual of $K_m$. For reasons that will become clear, we call it a matching scenario.

$\mathrm{Mat}_5$ coincides with Figure 2(b) from [PMMM05]. Using the CSW formalism [CSW10], it
has also recently been studied in [Cab12a]. These latter results can be transferred to our setting
using the construction of Section 9.1, but they will live in the contextuality scenario $\mathrm{Mat}_5'$ which
contains additional vertices representing no-detection events.

There are certain probabilistic models on $\mathrm{Mat}_m$ which have a special form. By a half-integer
matching, we mean a probabilistic model on $\mathrm{Mat}_m$ in which each probability lies in $\{0, \tfrac{1}{2}, 1\}$ in
such a way that the edges with positive probability define a decomposition of $K_m$ into cycles of
odd length, where we regard an edge of probability 1 as a cycle of length 1. In particular, every
perfect matching on $K_m$ can be regarded as a half-integer matching.

Proposition 9.4.1. (a) The deterministic models on $\mathrm{Mat}_m$ are the perfect matchings on
$K_m$.

(b) $\mathcal{C}(\mathrm{Mat}_m)$ is the perfect matching polytope [Sch03] on $K_m$. In particular, $\mathcal{C}(\mathrm{Mat}_m) \ne \emptyset$ if
and only if $m$ is even.

(c) $\mathcal{G}(\mathrm{Mat}_m)$ is the fractional matching polytope. Its extreme points are precisely the
half-integer matchings.

(d) $\mathcal{CE}^1(\mathrm{Mat}_m)$ is a polytope strictly intermediate between $\mathcal{C}(\mathrm{Mat}_m)$ and $\mathcal{G}(\mathrm{Mat}_m)$ for $m \ge 5$.

Proof. (a) Using Remark 4.1, a deterministic model corresponds to a collection of edges
in $K_m$ such that there is exactly one edge incident to each vertex. This is the definition
of perfect matching.

(b) $\mathcal{C}(\mathrm{Mat}_m)$ is defined to be the convex hull of the deterministic models, and likewise the
perfect matching polytope is defined to be the convex hull of the perfect matchings, in the
same ambient space.

(c) The inequalities defining $\mathcal{G}(\mathrm{Mat}_m)$ are precisely those defining the standard linear
relaxation of the perfect matching polytope. Its extreme points are known to be the half-integer
matchings [Sch03]. This can also be proven using Theorem 2.4.3.

(d) For $m \ge 5$, there are two kinds of maximal independent sets in $\mathrm{Ort}(\mathrm{Mat}_m)$: first, the edges
of $\mathrm{Mat}_m$ themselves; second, all triples of edges in $K_m$ that form a triangle. The latter
impose the additional constraint that the sum of the edge weights in a triangle should not
exceed 1. Therefore, the half-integer matchings with cycles of length 3 do not belong to
$\mathcal{CE}^1(\mathrm{Mat}_m)$, which is hence a polytope strictly contained in $\mathcal{G}(\mathrm{Mat}_m)$. On the other hand,
$\mathcal{CE}^1(\mathrm{Mat}_m)$ still contains half-integer matchings with odd cycles of length $\ge 5$, which are
not in $\mathcal{C}(\mathrm{Mat}_m)$.

□
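
Part (a) can be illustrated by direct enumeration for small $m$. The sketch below builds the matching scenario from the complete graph on $\{0, \ldots, m-1\}$ and lists its deterministic models, taken to be the 0/1 assignments with exactly one 1 per hyperedge; the function names are hypothetical and the brute-force search is only practical for small $m$.

```python
from itertools import combinations, product


def matching_scenario(m):
    """Vertices: edges of K_m as 2-subsets. Hyperedge e_j: all K_m-edges containing j."""
    vertices = [frozenset(c) for c in combinations(range(m), 2)]
    edges = [[v for v in vertices if j in v] for j in range(m)]
    return vertices, edges


def deterministic_models(m):
    """Return the supports of all 0/1 assignments with exactly one 1 per hyperedge."""
    vertices, edges = matching_scenario(m)
    models = []
    for bits in product((0, 1), repeat=len(vertices)):
        p = dict(zip(vertices, bits))
        if all(sum(p[v] for v in e) == 1 for e in edges):
            models.append({v for v in vertices if p[v] == 1})
    return models


for m in (4, 5, 6):
    print(m, len(deterministic_models(m)))
```

The counts agree with the number of perfect matchings of $K_m$, namely 3 for $m = 4$, none for $m = 5$, and 15 for $m = 6$, in line with parts (a) and (b).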

None of this is specific to $K_m$; the same construction can be carried out starting with any other graph.
By definition, $\mathrm{Ort}(\mathrm{Mat}_m)$ is the Kneser graph $KG_{m,2}$ [Lov78]. In particular, $\mathrm{Ort}(\mathrm{Mat}_5)$ is the
Petersen graph. But the curiosities do not end here:

Corollary 9.4.2. $\mathcal{CE}^1(\mathrm{Mat}_5)$, when scaled by a factor of 2, is the symmetric traveling
salesman polytope STSP(5) [GP79, NP01].

Proof. Since 5 is odd, $K_5$ has no perfect matchings. Therefore, every half-integer matching
on $K_5$ is a disjoint union of cycles of edges with weight $\tfrac{1}{2}$. Now it follows from (d) that every
extremal vertex of $\mathcal{CE}^1(\mathrm{Mat}_5)$ is a cycle of length 5 with weight $\tfrac{1}{2}$ on each edge, as was to be
shown. □
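
The identification of the Kneser graph $KG_{5,2}$ with the Petersen graph, used earlier in this subsection, can be checked mechanically. The sketch below assumes the Kneser adjacency, in which two 2-subsets are joined exactly when they are disjoint, and tests the strongly regular parameters (10, 3, 0, 1), which are known to characterize the Petersen graph. The function names are illustrative.

```python
from itertools import combinations


def disjointness_graph(m):
    """Kneser graph KG_{m,2}: 2-subsets of {0, ..., m-1}, adjacent when disjoint."""
    vertices = [frozenset(c) for c in combinations(range(m), 2)]
    return {v: {u for u in vertices if not (u & v)} for v in vertices}


def has_petersen_parameters(adj):
    """Check: 10 vertices, 3-regular, adjacent pairs share 0 and other pairs share 1 neighbour."""
    if len(adj) != 10 or any(len(nbrs) != 3 for nbrs in adj.values()):
        return False
    for u, v in combinations(adj, 2):
        common = len(adj[u] & adj[v])
        if common != (0 if v in adj[u] else 1):
            return False
    return True


print(has_petersen_parameters(disjointness_graph(5)))
```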
Appendix A. Background on graph theory

This section reviews standard material on the invariants of graphs which are of relevance to the
main text, first for unweighted and then for weighted graphs. As far as we know, the conjectures
in Sections A.2 and A.4 are new.

For us, a graph is an undirected simple graph without isolated vertices. When $G$ is a graph,
we denote its set of vertices by $V(G)$. For $u, v \in V(G)$, we write $u \sim_G v$ whenever $u$ and $v$ share
an edge (are adjacent) in $G$. Usually the graph is clear, and then we simply write $u \sim v$.

There are many ways to take products of graphs [IK00]. For us, the relevant one is this:

Definition A.0.1. Let $G_1$ and $G_2$ be graphs. Their strong product is the graph $G_1 \boxtimes G_2$ with

\[
V(G_1 \boxtimes G_2) := V(G_1) \times V(G_2)
\]

and $(u_1, u_2) \sim (v_1, v_2)$ whenever

\[
(u_1 \sim v_1 \wedge u_2 \sim v_2) \ \vee\ (u_1 \sim v_1 \wedge u_2 = v_2) \ \vee\ (u_1 = v_1 \wedge u_2 \sim v_2).
\]

For $n \in \mathbb{N}$, we write $G^{\boxtimes n}$ for the n-fold strong product of $G$ with itself.

A.1. Relevant invariants of unweighted graphs. Since we will later consider graphs
equipped with vertex weights, we also use the term "unweighted graph" when working with plain
graphs in order to emphasize the distinction.

Recall that an independent set in a graph $G$ is a subset $I \subseteq V(G)$ such that no two vertices
in $I$ share an edge. $I$ is an independent set in $G$ if and only if it is a clique in the complement
graph $\overline{G}$. An independent set $I$ is maximal if there is no other independent set $I' \subseteq V(G)$ with
$I \subsetneq I'$. The independence number $\alpha(G)$ is the largest number of elements of any independent
set in $G$ (sometimes also called the stability number).

Lemma A.1.1. Let $I_1 \subseteq G_1$ and $I_2 \subseteq G_2$ be maximal independent sets. Then $I_1 \times I_2 \subseteq G_1 \boxtimes G_2$
is also a maximal independent set.

Proof. The definition of adjacency in $G_1 \boxtimes G_2$ implies immediately that $I_1 \times I_2$ is also an
independent set in $G_1 \boxtimes G_2$.

We now show maximality of $I = I_1 \times I_2$. For any $v = (v_1, v_2) \in V(G_1 \boxtimes G_2) \setminus I$, the following
cases are possible:

(a) Case $v_1 \notin I_1$ and $v_2 \notin I_2$: by maximality of $I_1$ and $I_2$, there are $u_1 \in I_1$ with $u_1 \sim v_1$ and
$u_2 \in I_2$ with $u_2 \sim v_2$. Hence $(u_1, u_2) \sim (v_1, v_2)$.

(b) Case $v_1 \notin I_1$ and $v_2 \in I_2$: by maximality of $I_1$, there is $u_1 \in I_1$ with $u_1 \sim v_1$. Hence
$(u_1, v_2) \in I$ and $(u_1, v_2) \sim (v_1, v_2)$.

(c) Case $v_1 \in I_1$ and $v_2 \notin I_2$: similar to the previous case.

In each case, the conclusion is that $v$ is adjacent to some vertex in $I$, and hence $I$ is a maximal
independent set. □

Lemma A.1.2.

\[
\alpha(G_1 \boxtimes G_2) \ge \alpha(G_1)\,\alpha(G_2).
\]

Proof. Lemma A.1.1. □

In particular, this implies

\[
\alpha(G^{\boxtimes (n+m)}) \ge \alpha(G^{\boxtimes n})\,\alpha(G^{\boxtimes m}) \qquad \forall m, n \in \mathbb{N}. \tag{A.1}
\]




Remark A.1.3. Despite this inequality, the sequence $\big(\sqrt[n]{\alpha(G^{\boxtimes n})}\big)_{n \in \mathbb{N}}$ is not monotonically
increasing in general; this happens, for example, for the pentagon graph (or 5-cycle) $C_5$, for which

\[
\alpha(C_5) = 2, \qquad \alpha(C_5^{\boxtimes 2}) = 5, \qquad \alpha(C_5^{\boxtimes 3}) = 10.
\]

See [AL06] for more results on the behavior of $\big(\sqrt[n]{\alpha(C_5^{\boxtimes n})}\big)_{n \in \mathbb{N}}$.
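
The first two values in this remark can be reproduced by brute force. The following sketch builds the strong product according to Definition A.0.1 and computes independence numbers with a simple exact branching search; graphs are stored as adjacency dictionaries and all function names are illustrative. The value for the third power is taken from the remark rather than recomputed.

```python
def cycle_graph(n):
    """Adjacency dictionary of the n-cycle on vertices 0, ..., n-1."""
    return {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}


def strong_product(adj1, adj2):
    """Strong product: distinct pairs are adjacent if each coordinate is equal or adjacent."""
    vertices = [(a, b) for a in adj1 for b in adj2]

    def adjacent(x, y):
        (u1, u2), (v1, v2) = x, y
        return x != y and (u1 == v1 or v1 in adj1[u1]) and (u2 == v2 or v2 in adj2[u2])

    return {x: {y for y in vertices if adjacent(x, y)} for x in vertices}


def independence_number(adj):
    """Exact independence number by branching on a vertex of maximum remaining degree."""
    def solve(candidates):
        if not candidates:
            return 0
        v = max(candidates, key=lambda u: len(adj[u] & candidates))
        if not adj[v] & candidates:
            return len(candidates)  # no edges left among the candidates
        take_v = 1 + solve(candidates - adj[v] - {v})
        skip_v = solve(candidates - {v})
        return max(take_v, skip_v)

    return solve(frozenset(adj))


c5 = cycle_graph(5)
alpha_1 = independence_number(c5)
alpha_2 = independence_number(strong_product(c5, c5))
alpha_3 = 10  # value quoted in Remark A.1.3, not recomputed here
roots = [alpha_1, alpha_2 ** (1 / 2), alpha_3 ** (1 / 3)]
print(alpha_1, alpha_2, roots)
```

The n-th roots come out as 2, then roughly 2.236, then roughly 2.154, so the sequence rises and then falls, which is the non-monotonicity described above.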

In combination with Fekete's Lemma [Fek23], (A.1) guarantees the existence of the following
limit:

Definition A.1.4 (Shannon capacity). The (unweighted) Shannon capacity $\Theta(G)$ is

\[
\Theta(G) := \lim_{n \to \infty} \sqrt[n]{\alpha(G^{\boxtimes n})}. \tag{A.2}
\]

Intuitively, $\Theta(G)$ is an asymptotic version of the independence number $\alpha(G)$.

This number can be interpreted in terms of information theory as follows. When $V(G)$ is the
input alphabet of a classical communication channel such that $u \sim v$ if and only if $u$ and $v$ have
non-trivial probability to produce the same channel output, then this channel can asymptotically transfer
$\log_2 \Theta(G)$ bits of perfect information per channel use. $G$ is then called the confusability graph
of the channel. This is the context in which $\Theta$ was originally introduced by Shannon [Sha56]. The
use of the logarithm here differs from the standard information-theoretic definitions of capacities,
which usually already include it in their definition.

Not much is known about the values of $\Theta$ for particular graphs, not even $\Theta(C_7)$, where $C_7$ is
the 7-cycle [CGR03].

For graphs $G_1$ and $G_2$, we write $G_1 + G_2$ for their disjoint union, which is again a graph.

Lemma A.1.5 ([Sha56]). (a)

\[
\Theta(G_1 + G_2) \ge \Theta(G_1) + \Theta(G_2). \tag{A.3}
\]

(b)

\[
\Theta(G_1 \boxtimes G_2) \ge \Theta(G_1)\,\Theta(G_2). \tag{A.4}
\]

Finding examples in which these inequalities are not tight is surprisingly difficult. The following
results are due to Haemers and Alon.

Theorem A.1.6 ([Hae79, Alo98]). There exist graphs $G_1$ and $G_2$ such that

(a) $\Theta(G_1 \boxtimes G_2) > \Theta(G_1)\,\Theta(G_2)$.

(b) $\Theta(G_1 + G_2) > \Theta(G_1) + \Theta(G_2)$.

Of particular relevance for our considerations in the main text are graphs whose independence
number coincides with their Shannon capacity:

Definition A.1.7. A graph $G$ is single-shot if $\alpha(G) = \Theta(G)$.

Single-shot graphs are the Class 1 graphs of Berge [Ber97]⁴. $G$ is single-shot precisely when the
sequence $\big(\sqrt[n]{\alpha(G^{\boxtimes n})}\big)_{n \in \mathbb{N}}$ is constant. Our terminology is motivated by the information-theoretic
interpretation alluded to above: if a communication channel has a confusability graph which is
single-shot, then there exists a zero-error code for this channel which operates on the single-shot
level.

Due to standard results [Knu94], every perfect graph is single-shot. The Petersen graph is
not perfect, but nevertheless single-shot since its Lovász number (see below) coincides with its
independence number [Knu94, p. 31].

⁴ We thank Andras Salamon for pointing out this reference.

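Definition A.1.8 (Lovász number [Lov79]). (a) An orthonormal labeling of $G$ is an
assignment $v \mapsto |\psi_v\rangle$ of a unit vector $|\psi_v\rangle \in \mathbb{R}^{|V(G)|}$ to every $v \in V(G)$ such that $u \not\sim v$
and $u \ne v$ implies $|\psi_u\rangle \perp |\psi_v\rangle$.
(b) The Lovász number $\vartheta(G)$ is

\[
\vartheta(G) := \min_{|\Psi\rangle,\, (|\psi_v\rangle)} \ \max_{v \in V(G)} \frac{1}{|\langle \Psi | \psi_v \rangle|^2}
\]

where $|\Psi\rangle \in \mathbb{R}^{|V(G)|}$ ranges over all unit vectors and $(|\psi_v\rangle)_{v \in V(G)}$ over all orthonormal
labelings.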
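
As an illustration of this definition, the sketch below writes down the classical "umbrella" labeling of the 5-cycle $C_5$ due to Lovász and evaluates the quantity being minimized. The vectors are given in $\mathbb{R}^3$ and can be padded with zeros to lie in $\mathbb{R}^5$; the function names are illustrative. Any single labeling only certifies an upper bound on $\vartheta(C_5)$, here $\sqrt{5}$, which is known to be the exact value.

```python
import math


def umbrella_labeling():
    """Return a handle vector and five unit 'rib' vectors for the vertices 0..4 of C_5."""
    cos_sq = 1 / math.sqrt(5)              # squared cosine between the handle and each rib
    c, s = math.sqrt(cos_sq), math.sqrt(1 - cos_sq)
    ribs = [
        (c, s * math.cos(2 * math.pi * k / 5), s * math.sin(2 * math.pi * k / 5))
        for k in range(5)
    ]
    handle = (1.0, 0.0, 0.0)
    return handle, ribs


def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def labeling_value(handle, ribs):
    """The quantity max_v 1 / |<Psi|psi_v>|^2 from the definition."""
    return max(1 / dot(handle, r) ** 2 for r in ribs)


handle, ribs = umbrella_labeling()
# Non-adjacent vertices of C_5 are k and k+2 (mod 5); their vectors must be orthogonal.
worst_overlap = max(abs(dot(ribs[k], ribs[(k + 2) % 5])) for k in range(5))
print(worst_overlap, labeling_value(handle, ribs), math.sqrt(5))
```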

There are several other equivalent definitions of $\vartheta(G)$ commonly used [Lov79]. Multiplicativity
of $\vartheta$ is one of its many useful properties:

Proposition A.1.9 ([Lov79]).

\[
\vartheta(G_1 \boxtimes G_2) = \vartheta(G_1)\,\vartheta(G_2).
\]

Definition A.1.10. The fractional packing number $\alpha^*(G)$ is

\[
\alpha^*(G) := \max_q \sum_v q_v
\]

where $q : V(G) \to [0, 1]$ ranges over all vertex weightings satisfying $\sum_{v \in C} q_v \le 1$ for all cliques
$C \subseteq V(G)$.

The fractional packing number can be regarded as the linear relaxation of the independence
number. For this reason, it is sometimes also called the fractional independence number.

Proposition A.1.11 ([Lov79]).

\[
\alpha(G) \le \Theta(G) \le \vartheta(G) \le \alpha^*(G).
\]

In general, none of these inequalities is an equality. This is most difficult to see for $\Theta(G) < \vartheta(G)$,
for which it was shown by Haemers [Hae79] after having been posed as an open problem by
Lovász [Lov79].

A.2. Main conjectures for unweighted graphs. Single-shot graphs are of particular
relevance for the material in Section 7, and therefore we would like to understand some of their
properties.

In the following, we routinely use the material of the previous subsection without explicit
reference.

Conjecture A.2.1. For all single-shot graphs $G_1$, $G_2$,

(a) $\alpha(G_1 \boxtimes G_2) = \alpha(G_1)\,\alpha(G_2)$;

(b) $\Theta(G_1 \boxtimes G_2) = \Theta(G_1)\,\Theta(G_2)$;

(c) $G_1 + G_2$ is also single-shot, so that $\Theta(G_1 + G_2) = \Theta(G_1) + \Theta(G_2)$.






Concerning (c), we have

\[
\alpha(G_1 + G_2) = \alpha(G_1) + \alpha(G_2) = \Theta(G_1) + \Theta(G_2) \le \Theta(G_1 + G_2),
\]

so that $G_1 + G_2$ is also single-shot if and only if $\Theta(G_1 + G_2) = \Theta(G_1) + \Theta(G_2)$. However, it is not
clear whether (b) is similarly related to the question whether $G_1 \boxtimes G_2$ is also single-shot.

We are far from being able to answer any of these conjectures. Using Corollary A.3.10, it is
clear that (a) holds in the class of those graphs which satisfy $\alpha(G) = \Theta(G) = \vartheta(G)$, and (b) and (c)
then follow from the upcoming Proposition A.2.2; however, it was shown by Haemers [Hae81] that
there are graphs $G$ with $\alpha(G) = \Theta(G) < \vartheta(G)$. Moreover, the other examples of Haemers [Hae79]
or Alon [Alo98] seem to have no bearing on the above conjectures, although Haemers' bound might
also turn out to be useful for finding counterexamples. We also hope that our Theorem 7.5.2 might
be useful for finding counterexamples since it allows one to consider a large class of specific cases in terms
of physical intuition.

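Despite the difficulty of these conjectures, it is relatively simple to show some interrelations
between them:

Proposition A.2.2. Conjectures (a) and (b) are equivalent and imply (c).

Proof. Assuming (b), we get

\[
\alpha(G_1)\,\alpha(G_2) \le \alpha(G_1 \boxtimes G_2) \le \Theta(G_1 \boxtimes G_2) = \Theta(G_1)\,\Theta(G_2) = \alpha(G_1)\,\alpha(G_2),
\]

which proves (a). Conversely, assuming (a) gives

\[
\Theta(G_1 \boxtimes G_2) = \lim_{n} \sqrt[n]{\alpha\big(G_1^{\boxtimes n} \boxtimes G_2^{\boxtimes n}\big)} = \lim_{n} \sqrt[n]{\alpha\big(G_1^{\boxtimes n}\big)\,\alpha\big(G_2^{\boxtimes n}\big)} = \Theta(G_1)\,\Theta(G_2),
\]

since if $G_1$ and $G_2$ are single-shot, then so are $G_1^{\boxtimes n}$ and $G_2^{\boxtimes n}$ for any $n \in \mathbb{N}$, and (a) then also
applies to these graphs.

Concerning (c), we use the fact that $\boxtimes$ distributes over $+$, so that

\[
(G_1 + G_2)^{\boxtimes n} = \sum_{k=0}^{n} \binom{n}{k}\, G_1^{\boxtimes k} \boxtimes G_2^{\boxtimes (n-k)}.
\]

This gives

\[
\Theta(G_1 + G_2) = \lim_{n} \sqrt[n]{\alpha\Big(\sum_{k=0}^{n} \binom{n}{k}\, G_1^{\boxtimes k} \boxtimes G_2^{\boxtimes (n-k)}\Big)} = \lim_{n} \sqrt[n]{\sum_{k=0}^{n} \binom{n}{k}\, \alpha\big(G_1^{\boxtimes k} \boxtimes G_2^{\boxtimes (n-k)}\big)}.
\]

Applying (a) to the single-shot graphs $G_1^{\boxtimes k}$ and $G_2^{\boxtimes (n-k)}$ evaluates this to

\[
\Theta(G_1 + G_2) = \lim_{n} \sqrt[n]{\sum_{k=0}^{n} \binom{n}{k}\, \alpha(G_1)^k\, \alpha(G_2)^{n-k}} = \alpha(G_1) + \alpha(G_2) = \Theta(G_1) + \Theta(G_2).
\]

□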



A.3. Relevant invariants of weighted graphs. We now generalize the definitions to graphs
equipped with vertex weights, i.e. to graphs $G$ equipped with a weight function $p : V(G) \to \mathbb{R}_+$.
We omit a proof whenever it is completely analogous to the unweighted case. Weight functions
$p_1 : V(G_1) \to \mathbb{R}_+$ and $p_2 : V(G_2) \to \mathbb{R}_+$ multiply to a weight function

\[
p_1 \otimes p_2 : V(G_1 \boxtimes G_2) \to \mathbb{R}_+, \qquad (v_1, v_2) \mapsto p_1(v_1)\, p_2(v_2).
\]

In this way, $p^{\otimes n}$ is a weight function on $G^{\boxtimes n}$. Similarly, there is an obvious weight function $p_1 + p_2$
defined on the disjoint union $G_1 + G_2$. When $p_1$ and $p_2$ are defined on the same graph, we use the
same notation $p_1 + p_2$ for the pointwise sum; despite this ambiguous notation, the meaning will
always be clear from the context.

Definition A.3.1. Let $G$ be a graph equipped with vertex weights $p$.

(a) The weighted independence number $\alpha(G, p)$ is the largest total weight of an
independent set in $G$.

(b) The weighted Lovász number $\vartheta(G, p)$ is

\[
\vartheta(G, p) := \min_{|\Psi\rangle,\, (|\psi_v\rangle)} \ \max_{v \in V(G)} \frac{p(v)}{|\langle \Psi | \psi_v \rangle|^2} \tag{A.5}
\]

where $|\Psi\rangle \in \mathbb{R}^{|V(G)|}$ ranges over all unit vectors and $(|\psi_v\rangle)_{v \in V(G)}$ over all orthonormal
labelings.

(c) The weighted Shannon capacity $\Theta(G, p)$ is

\[
\Theta(G, p) := \lim_{n \to \infty} \sqrt[n]{\alpha(G^{\boxtimes n}, p^{\otimes n})}. \tag{A.6}
\]

(d) The weighted fractional packing number $\alpha^*(G, p)$ is

\[
\alpha^*(G, p) := \max_q \sum_{v \in V} p(v)\, q(v)
\]

where $q : V(G) \to \mathbb{R}_+$ ranges over all vertex weights satisfying $\sum_{v \in C} q(v) \le 1$ for all
cliques $C \subseteq V(G)$.

The fraction in (A.5) uses the convention $\tfrac{0}{0} = 0$. See [Knu94] for several equivalent definitions
of $\vartheta(G, p)$.

All of these quantities specialize to their unweighted counterparts by choosing unit weights
$p = 1$.

Proposition A.3.2. Let $\mathrm{Cl}(G)$ denote the set of all cliques on $G$. Then

\[
\alpha^*(G, p) = \min_x \sum_{C \in \mathrm{Cl}(G)} x(C) \tag{A.7}
\]

where $x$ ranges over all functions $x : \mathrm{Cl}(G) \to \mathbb{R}_+$ with $p(v) \le \sum_{C \ni v} x(C)$ for all $v$.

Proof. Linear programming duality. □

Lemma A.3.3. (a)

\[
\Theta(G_1 + G_2, p_1 + p_2) \ge \Theta(G_1, p_1) + \Theta(G_2, p_2). \tag{A.8}
\]

(b)

\[
\Theta(G_1 \boxtimes G_2, p_1 \otimes p_2) \ge \Theta(G_1, p_1)\,\Theta(G_2, p_2). \tag{A.9}
\]

Proof. As in the unweighted case [Sha56]. □

Since these inequalities are not tight in general in the unweighted case [Hae79, Alo98], neither
can they be tight in the weighted case. One might expect simpler counterexamples to exist in the
weighted case, but we have not been successful in finding any.

When $p_1$, $p_2$ are weight functions on the same graph $G$, superadditivity no longer holds for
trivial reasons: e.g. for $G = K_2$, the graph on two adjacent vertices $\{u, v\}$ with $p_1 = 1_u$ and $p_2 = 1_v$,
we have

\[
1 = \Theta(G, p_1 + p_2) < \Theta(G, p_1) + \Theta(G, p_2) = 2.
\]

Many statements about these invariants can be reduced to statements about their unweighted
counterparts using a technique we call blow-up. Applying this technique requires the vertex weights
to be rational. Therefore, we begin by proving a continuity result which allows us to reduce many
problems to the case of rational weights.

Lemma A.3.4. Let $(G, p)$ be a weighted graph and $\overline{K}_m$ the empty graph on $m$ vertices with
weights $q$. Then,

\[
X(G + \overline{K}_m, p + q) = X(G, p) + \sum_{v \in V(\overline{K}_m)} q(v) \tag{A.10}
\]

for all four invariants $X \in \{\alpha, \Theta, \vartheta, \alpha^*\}$.

Proof. This is trivial for $X = \alpha$. For $X = \vartheta$, it is a special case of [Knu94, eq. (18.2)]. For
$X = \alpha^*$, it follows from an application of Proposition A.3.2. It remains to treat the case $X = \Theta$.

Since $\Theta(\overline{K}_m, q) = \sum_v q_v$, the inequality $\ge$ is an instance of superadditivity (A.8) of $\Theta$. To
also show $\le$, we choose any independent set $I$ in $(G + \overline{K}_m)^{\boxtimes n}$ and partition it into a disjoint
union

\[
I = \bigcup_{s \in \{0,1\}^n} I_s
\]

where each $I_s$ contains only vertices $(v_1, \ldots, v_n)$ with $v_i \in V(G)$ if $s_i = 0$ and $v_i \in V(\overline{K}_m)$ if $s_i = 1$.
Then upon dropping all components $i$ with $s_i = 1$, such an $I_s$ becomes an independent set in some
$G^{\boxtimes k}$. In this way, we get the estimate

\[
\alpha\big((G + \overline{K}_m)^{\boxtimes n}, (p + q)^{\otimes n}\big) \le \sum_{k=0}^{n} \binom{n}{k}\, \alpha(G^{\boxtimes k}, p^{\otimes k}) \Big(\sum_v q_v\Big)^{n-k} \le \Big(\Theta(G, p) + \sum_v q_v\Big)^{n}
\]

which implies the desired inequality upon taking the n-th root and then $n \to \infty$. □

Lemma A.3.5. Let $(G, p)$ be a weighted graph, $v \in G$ a vertex, $q \in \mathbb{R}_+$ and $X \in \{\alpha, \Theta, \vartheta, \alpha^*\}$.
Then

\[
X(G, p) \le X(G, p + q 1_v) \le X(G, p) + q. \tag{A.11}
\]

Proof. The first inequality is clear since $X(G, p)$ is a non-decreasing function of $p$.

Since adding additional edges cannot increase the value of $X$ and two vertices with exactly the
same neighbors can be identified to one vertex by adding the weights (for $X = \vartheta$, see [Knu94,
Lemma 16]), we have $X(G, p + q 1_v) \le X(G + K_1, p + q)$. Now the second inequality follows from
the previous lemma with $m = 1$. □

This lemma directly gives the desired continuity result:

Corollary A.3.6. For any graph $G$ and any $X \in \{\alpha, \Theta, \vartheta, \alpha^*\}$, the function $p \mapsto X(G, p)$ is
continuous.







We can now introduce the blow-up technique, which can be used to translate problems from the
weighted case to the unweighted setting.

Definition A.3.7. Let $(G, p)$ be a weighted graph with $p(v) \in \mathbb{N}$ for all $v$. Then the blow-up
$\mathrm{Blup}(G, p)$ is the unweighted graph with vertex set

\[
\{ (v, k) : v \in G,\ k \in \{1, \ldots, p(v)\} \},
\]

where we take $(v, k)$ and $(v', k')$ to be adjacent if and only if $v \sim v'$ in $G$.

Intuitively speaking, $\mathrm{Blup}(G, p)$ is constructed by replacing every vertex $v$ in $G$ by $p(v)$ many
non-adjacent vertices. In particular, if $p(v) = 0$, the vertex $v$ simply gets removed from the graph.
Blow-ups have also been considered in [Knu94, Sec. 16], although not under that name.
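
A small sketch may help to make the blow-up concrete. It constructs the blow-up of a graph stored as an adjacency dictionary and compares, by exhaustive search, the weighted independence number of the original graph with the unweighted independence number of its blow-up. The function names and the example weights on the 5-cycle are hypothetical choices for illustration.

```python
from itertools import combinations


def blow_up(adj, weights):
    """Replace each vertex v by weights[v] mutually non-adjacent copies (v, 1), ..., (v, p(v))."""
    vertices = [(v, k) for v in adj for k in range(1, weights[v] + 1)]
    return {x: {y for y in vertices if y[0] in adj[x[0]]} for x in vertices}


def weighted_alpha(adj, weights):
    """Largest total weight of an independent set, by exhaustive search (small graphs only)."""
    vertices = list(adj)
    best = 0
    for r in range(len(vertices) + 1):
        for subset in combinations(vertices, r):
            if all(b not in adj[a] for a, b in combinations(subset, 2)):
                best = max(best, sum(weights[v] for v in subset))
    return best


# Example: the 5-cycle with integer weights; vertex 3 has weight 0 and disappears.
c5 = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
p = {0: 2, 1: 1, 2: 3, 3: 0, 4: 1}
blown = blow_up(c5, p)
unit = {x: 1 for x in blown}
print(weighted_alpha(c5, p), weighted_alpha(blown, unit))
```

Both numbers equal 5 in this example, attained by the vertices 0 and 2 and by their five copies respectively, which is the behaviour asserted for the independence number in the lemma that follows.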

Lemma A.3.8. For vertex weights in $\mathbb{N}$,

(a) $\mathrm{Blup}(G_1 + G_2, p_1 + p_2) = \mathrm{Blup}(G_1, p_1) + \mathrm{Blup}(G_2, p_2)$.

(b) $\mathrm{Blup}(G_1 \boxtimes G_2, p_1 \otimes p_2) = \mathrm{Blup}(G_1, p_1) \boxtimes \mathrm{Blup}(G_2, p_2)$.

(c) $X(\mathrm{Blup}(G, p)) = X(G, p)$ for every $X \in \{\alpha, \Theta, \vartheta, \alpha^*\}$.

Proof. Clear. □

We can now already reap some of the simpler benefits of the blow-up technique:

Corollary A.3.9.

\[
\alpha(G, p) \le \Theta(G, p) \le \vartheta(G, p) \le \alpha^*(G, p).
\]

Proof. Combine Lemma A.3.8 with Proposition A.1.11. □

Corollary A.3.10 ([Knu94, (20.5)]).

\[
\vartheta(G_1 \boxtimes G_2, p_1 \otimes p_2) = \vartheta(G_1, p_1)\,\vartheta(G_2, p_2).
\]

Proof. Combine Lemma A.3.8 with Proposition A.1.9. □

Corollary A.3.11.

\[
\alpha(G_1 \boxtimes G_2, p_1 \otimes p_2) \ge \alpha(G_1, p_1)\,\alpha(G_2, p_2).
\]

Proof. Combine Lemma A.3.8 with Lemma A.1.2. □

A. 4. Main conjectures for weighted graphs. We now extend the material of Section A. 2 
to the weighted case. 

Definition A. 4.1. A weighted graph (G,p) is single-shot if a(G,p) = Q(G,p). 

Conjecture A. 4. 2. For all weighted single-shot graphs (Gi,pi), (G 2 ,p 2 ), 

(a) a(GiSG 2 ,pi ®p 2 ) = a(G 1 ,p 2 )a(G 2 ,P2) ; 

(b) 9(GiSG 2 ,pi®p 2 ) = e(Gi,pi)9(G 2 ,p 2 ); 

(c) (Gi + G 2 ,pi +p 2 ) is also single-shot, so that 0(Gi + G 2 ,pi +p 2 ) = Q(Gi,pi) + Q(G 2 ,p 2 ). 

Applying the blow-up technique in order to relate these conjectures to their unweighted coun- 
terparts requires some further preparation. 

Lemma A.4.3. Let $(G,p)$ be a weighted single-shot graph. Then for every $\varepsilon > 0$ there exist
weights $p'(v) \in \mathbb{Q}$ with $|p(v) - p'(v)| < \varepsilon$ and such that $(G,p')$ is still single-shot with
$\alpha(G,p') = \alpha(G,p)$.

Proof. Let $p_{\max}$ be the largest weight of a vertex in $G$, and fix $\delta > 0$ such that $2\delta \cdot p_{\max} < \varepsilon$.
Fix any independent set $v_1, \ldots, v_n$ of maximal weight and choose rational numbers
$p'(v_i) \in ((1-\delta)p(v_i), (1+\delta)p(v_i))$ such that $\sum_i p'(v_i) = \sum_i p(v_i) = \alpha(G,p)$. Furthermore, for
vertices $w$ not in that set, choose arbitrary rational numbers $p'(w) \in ((1-2\delta)p(w), (1-\delta)p(w))$.
Then $2\delta \cdot p_{\max} < \varepsilon$ guarantees $|p(v) - p'(v)| < \varepsilon$ for all $v \in V(G)$.

Now we claim that $\alpha(G,p') = \Theta(G,p') = \alpha(G,p)$. Upon setting $q_i := p'(v_i) - (1-\delta)p(v_i)$, we
estimate

\[ \alpha(G,p') \le \Theta(G,p') \le \Theta(G, (1-\delta)p) + \sum_i q_i, \]

where the last inequality follows from Lemma A.3.4 and the fact that transporting some weight
from some vertex to a new isolated vertex cannot decrease the capacity. Since
$\sum_i q_i = \alpha(G,p) - (1-\delta)\alpha(G,p)$, we can further evaluate this to

\[ \alpha(G,p') \le \Theta(G,p') \le (1-\delta)\Theta(G,p) + \delta\alpha(G,p) = \alpha(G,p). \]

On the other hand, we have constructed $p'$ in such a way that there is an independent set of weight
$\alpha(G,p)$, and hence all these inequalities are actually equalities. □

Theorem A.4.4. Each one of the Conjectures A.4.2 is equivalent to its sibling in A.2.1. In
particular, Conjectures A.4.2(a) and A.4.2(b) are equivalent and imply A.4.2(c).

Proof. By taking all weights to be 1, each statement in Conjecture A.2.1 becomes a special
case of the corresponding statement in Conjecture A.4.2. Therefore, the bulk of the proof lies in
showing the converse implications.

We illustrate how to translate any potential counterexample $(G_1 \boxtimes G_2, p_1 \otimes p_2)$ to
Conjecture A.4.2(a) into a counterexample to Conjecture A.2.1(a). To this end, we first apply
Lemma A.4.3 to both $(G_j, p_j)$ with a certain $\varepsilon > 0$ and obtain $(G_j, p'_j)$. Then, the difference
$(p'_1 \otimes p'_2)(v_1, v_2) - (p_1 \otimes p_2)(v_1, v_2)$ can be bounded by a certain function of $\varepsilon$ and the
$\alpha(G_j, p_j)$'s. In particular, one can choose $\varepsilon$ so small that

\[ \alpha(G_1 \boxtimes G_2, p'_1 \otimes p'_2) > \alpha(G_1,p_1)\, \alpha(G_2,p_2) = \alpha(G_1,p'_1)\, \alpha(G_2,p'_2). \]

After multiplying both weight functions by the respective common denominator, they become
integer-valued, and the claim then follows from Lemma A.3.8.

Conjecture A.4.2(b) follows from A.2.1(b) by analogous reasoning, now also relying on
Corollary A.3.6 for $X = \Theta$. The same holds for the implication of Conjecture A.4.2(c) from A.2.1(c). □
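
The step "multiplying both weight functions by the respective common denominator" can be made concrete as follows. This is a minimal sketch: the function name `clear_denominators` is hypothetical, and it relies on the assumption, implicit in the proof, that scaling all weights by a common positive factor scales each of the invariants by the same factor.

```python
from fractions import Fraction
from functools import reduce
from math import gcd

def clear_denominators(p):
    """Scale rational vertex weights to integers.

    Returns (n, q), where n is the least common denominator of the
    weights and q(v) = n * p(v) is integer-valued, so that the blow-up
    Blup(G, q) is defined.
    """
    denominators = [Fraction(w).denominator for w in p.values()]
    n = reduce(lambda a, b: a * b // gcd(a, b), denominators, 1)
    return n, {v: (Fraction(w) * n).numerator for v, w in p.items()}

n, q = clear_denominators(
    {"u": Fraction(1, 2), "v": Fraction(2, 3), "w": Fraction(5, 6)}
)
print(n, q)
```

Here the common denominator is 6 and the scaled weights are 3, 4 and 5.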

Conjecture A.4.5. Let $G$ be a graph with weight functions $p_1$ and $p_2$ such that both $(G,p_1)$
and $(G,p_2)$ are single-shot. Then

\[ \Theta(G, p_1 + p_2) \le \Theta(G, p_1) + \Theta(G, p_2). \]

Proposition A.4.6. Conjecture A.4.5 is equivalent to A.2.1(c).

Proof. We prove equivalence to Conjecture A.4.2(c), and then the claim follows from
Theorem A.4.4.

First, Conjecture A.4.2(c) easily follows from A.4.5 by taking $G = G_1 + G_2$ and using (A.8).
Hence it remains to prove that A.4.2(c) implies A.4.5. To see this, consider the graph $G'$ with
vertex set

\[ \{ (v,k) : v \in G,\ k \in \{1,2\} \}, \]

where we take $(v,k)$ and $(v',k')$ to be adjacent if and only if $v \sim v'$ in $G$. We equip $G'$ with
the weight function $p'$ given by $p'(v,1) = p_1(v)$ and $p'(v,2) = p_2(v)$. Then, by construction,

\[ \Theta(G',p') = \Theta(G, p_1 + p_2). \]

Now this $(G',p')$ is a union of $(G,p_1)$ and $(G,p_2)$, which it contains as induced weighted
subgraphs. Since removing edges cannot decrease $\Theta$, we get

\[ \Theta(G, p_1 + p_2) = \Theta(G',p') \le \Theta(G + G, p_1 + p_2) = \Theta(G,p_1) + \Theta(G,p_2), \]

as was to be shown. □

Figure 11. The CHSH scenario as a marginal scenario. We now draw the vertices
as squares in order to indicate that the interpretation differs from the one of all
other illustrations of hypergraphs in this paper. (Diagram labels: $(X, \mathcal{M})$, $O = \{0,1\}$.)



Appendix B. Relation to the observable-based approach 

The observable-based approach to quantum contextuality and nonlocality was first studied
explicitly by Abramsky and Brandenburger [AB11]. It was used much earlier in a different
mathematical context by Vorob'ev [Vor62]. See also [LSW11, FC12], where similar definitions
have been used. In this section, our goal is to show how the observable-based approach can be
embedded into our formalism. A converse construction should be possible upon augmenting the
observable-based approach by additional constraints as in [AB11, Sec. 7]. In this sense, the two
formalisms are essentially equivalent. We believe that both approaches have their merits; for
example, in both cases the relation to sophisticated mathematical methods can be exploited. In the
observable-based approach, this has been done in [AMSB11]; for the hypergraph-based approach,
this has been started in [CSW10] and further developed in this paper.

B.1. Definitions for the observable-based approach. The following definition blends the
terminology of [AB11] with the one of [FC12].

Definition B.1.1. A marginal scenario $(X, O, \mathcal{M})$ is a finite set $X = \{A_1, \ldots, A_n\}$, the
elements of which we call observables, together with a finite set $O$ of outcomes and a measurement
cover $\mathcal{M}$, which is a family of subsets $\mathcal{M} \subseteq 2^X$ such that

(a) every element of $X$ occurs in some $C$: $\bigcup_{C \in \mathcal{M}} C = X$.

(b) $\mathcal{M}$ is an anti-chain: $C, C' \in \mathcal{M},\ C \subseteq C' \Rightarrow C = C'$.

The $C \in \mathcal{M}$ are called measurement contexts.

From the mathematical point of view, the maximal sets of compatible observables are a
hypergraph precisely as in Definition 2.2.1, but the physical interpretation is quite different. The subsets
in $\mathcal{M}$ represent the maximal sets of jointly measurable observables. See Figure 11 for an example
in which the four pairs

\[ \{A_1, B_1\},\ \{A_1, B_2\},\ \{A_2, B_1\},\ \{A_2, B_2\} \]

are jointly measurable, but no other pairs or triples of observables are jointly measurable.

As is common practice with many other mathematical structures, we denote a marginal scenario
$(X, O, \mathcal{M})$ simply by $X$, at least when $O$ and $\mathcal{M}$ are clear from the context.

As noted in [AB11], it is not a substantial restriction to assume that all observables take
values in the same set of outcomes $O$. We assume this mainly for convenience of notation and note
that all of our considerations and results can easily be extended to the general case in which each
measurement $A \in X$ takes values in an associated finite set of outcomes $O_A$ depending on $A$.

In the following, we want to consider measurements of compatible observables which are
conducted in a certain temporal order. Assume that we have already measured some observable $A \in X$:
then is it possible to define a marginal scenario which encodes all the possibilities for subsequent
measurements? The following notion achieves this:

Definition B.1.2. Given an observable $A \in X$, the induced marginal scenario $X\{A\}$ is the
marginal scenario having observables

\[ X\{A\} = \{ A' \in X \mid A' \neq A,\ \exists\, C \in \mathcal{M} \text{ s.t. } \{A, A'\} \subseteq C \} \]

and measurement contexts defined to be the restrictions of those $C \in \mathcal{M}$ with $A \in C$ down to $X\{A\}$.

By definition, any $X\{A\}$ has a smaller number of observables than the original $X$. In particular,
iterating this construction by taking an induced marginal scenario of an induced marginal scenario
etc., one eventually ends up with an empty scenario, and the process terminates.

We make the following recursive definition: 

Definition B.1.3. A measurement protocol $T$ on a marginal scenario $X$ is

(a) $T = \emptyset$ if $X = \emptyset$;

(b) otherwise, $T = (A, f)$, where $A \in X$ is an observable and $f : O \to \mathrm{MP}(X\{A\})$ is a
function, where $\mathrm{MP}(X\{A\})$ is the set of all measurement protocols on the scenario $X\{A\}$.

Intuitively, a measurement protocol consists of a choice of observable and an assignment of a 
new measurement protocol to each outcome of the observable, where the new measurement protocol 
lives on the induced marginal scenario. 

Upon unraveling the recursive structure of this definition, one finds that a measurement
protocol specifies sequences of measurements which can be applied to the system, where the choices of
subsequent measurements are allowed to depend on the outcomes of the earlier ones. These
measurement sequences have the additional property that all measurements in a sequence are compatible
and that no measurement can occur twice in the same sequence. We use the letter "T" to indicate
the tree-like appearance of this structure. Note that every measurement sequence is automatically
maximal in the sense that it contains all observables of a certain measurement context.

The set of outcomes $\mathrm{Out}(T)$ of a measurement protocol $T$ is also defined recursively: if $T = \emptyset$,
then there is only a single outcome which we denote by "$*$", so that $\mathrm{Out}(\emptyset) = \{*\}$. Otherwise, we
have $T = (A, f)$ and put

\[ \mathrm{Out}(T) := \{ (a, \alpha) : a \in O,\ \alpha \in \mathrm{Out}(f(a)) \}. \]

In this way, an element of $\mathrm{Out}(T)$ corresponds to a measurement sequence in $T$ together with an
associated sequence of outcomes for these measurements such that applying the protocol to any
outcome in the sequence results in the following measurement, except if the outcome is the last one
in the sequence.

Definition B.1.4. The contextuality scenario $H[X]$ associated to a marginal scenario $X$ has
vertices

\[ V(H[X]) := \{ s \in O^C : C \in \mathcal{M} \} \]

and edges

\[ E(H[X]) := \{ \mathrm{Out}(T) : T \in \mathrm{MP}(X) \}. \]
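
The recursive definitions above translate directly into a short program. The following Python sketch enumerates the edges of $H[X]$ for the CHSH marginal scenario of Figure 11. The function names `induced` and `edges` are hypothetical, and a joint outcome is encoded here as a frozenset of (observable, outcome) pairs rather than as a nested tuple $(a, \alpha)$, so that outcomes of different protocols can be identified with the vertices $s \in O^C$.

```python
from itertools import product

def induced(observables, contexts, a):
    """Induced marginal scenario X{A} of Definition B.1.2."""
    obs = frozenset(b for b in observables
                    if b != a and any(a in c and b in c for c in contexts))
    # Restricting a context containing a down to X{A} just removes a.
    ctx = frozenset(c - {a} for c in contexts if a in c)
    return obs, ctx

def edges(observables, contexts, outcomes):
    """All sets Out(T), with T ranging over measurement protocols on X."""
    if not observables:
        # Out of the empty protocol: one outcome, the empty assignment.
        return {frozenset({frozenset()})}
    result = set()
    for a in observables:
        sub_edges = list(edges(*induced(observables, contexts, a), outcomes))
        # f assigns a sub-protocol (via its outcome set) to each outcome of a.
        for choice in product(sub_edges, repeat=len(outcomes)):
            result.add(frozenset(
                tail | {(a, o)}
                for o, tails in zip(outcomes, choice)
                for tail in tails
            ))
    return result

# CHSH: observables A1, A2, B1, B2; contexts are the four pairs {Ax, By}.
X = frozenset({"A1", "A2", "B1", "B2"})
M = {frozenset({a, b}) for a in ("A1", "A2") for b in ("B1", "B2")}
E = edges(X, M, (0, 1))
V = set().union(*E)
print(len(V), len(E))
```

For CHSH there are 16 protocols (four choices of first observable, then a choice of second observable for each of the two outcomes), but the four protocols with a fixed pair of settings arise twice, once for each measurement order, so the enumeration yields 12 distinct edges on 16 vertices.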

We write $P$ for an empirical model on $X$ [AB11]. This means that for each $C \in \mathcal{M}$, $P_C$ is
a probability distribution over $O^C$, such that the sheaf condition holds:

\[ P_C|_{C \cap C'} = P_{C'}|_{C \cap C'} \quad \forall\, C, C' \in \mathcal{M}, \tag{B.1} \]

where $P_C|_{C \cap C'}$ stands for the marginal distribution of $P_C$ associated to the observables in $C \cap C'$.
For an assignment of outcomes $s \in O^C$, $P_C(s)$ is to be thought of as the probability of obtaining the
joint outcome $s$ when jointly measuring all observables in $C$. The sheaf condition is a generalization
of the no-signaling condition.
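
As an illustration of the sheaf condition (B.1), the following Python sketch checks it for a model on the CHSH scenario. The encoding is an assumption made for this example: a context is a tuple of observable names, and $P_C$ is a dict from outcome tuples to probabilities. The Popescu-Rohrlich box is used as a standard example of a no-signaling model, and a second, deliberately signaling model shows a failure of (B.1).

```python
from fractions import Fraction
from itertools import product

def marginal(dist, context, sub):
    """Marginal of a distribution over O^context onto the observables in sub."""
    idx = [context.index(a) for a in sub]
    out = {}
    for s, prob in dist.items():
        key = tuple(s[i] for i in idx)
        out[key] = out.get(key, 0) + prob
    return out

def satisfies_sheaf_condition(model):
    """Check P_C restricted to C n C' equals P_C' restricted to C n C'."""
    for c1, c2 in product(model, repeat=2):
        common = tuple(a for a in c1 if a in c2)
        m1 = marginal(model[c1], c1, common)
        m2 = marginal(model[c2], c2, common)
        if any(m1.get(k, 0) != m2.get(k, 0) for k in set(m1) | set(m2)):
            return False
    return True

def pr_box():
    """PR box: outcomes satisfy a XOR b = x AND y with probability 1/2 each."""
    model = {}
    for x, y in product((0, 1), repeat=2):
        context = (f"A{x + 1}", f"B{y + 1}")
        model[context] = {
            (a, b): Fraction(1, 2) if (a ^ b) == (x & y) else Fraction(0)
            for a, b in product((0, 1), repeat=2)
        }
    return model

def point(s):
    """Deterministic distribution concentrated on the joint outcome s."""
    return {t: Fraction(int(t == s)) for t in product((0, 1), repeat=2)}

# The outcome of A1 depends on whether B1 or B2 is measured alongside it.
signaling = {
    ("A1", "B1"): point((0, 0)), ("A1", "B2"): point((1, 0)),
    ("A2", "B1"): point((0, 0)), ("A2", "B2"): point((0, 0)),
}
print(satisfies_sheaf_condition(pr_box()), satisfies_sheaf_condition(signaling))
```

Exact rational arithmetic via `Fraction` avoids spurious failures from floating-point rounding when comparing marginals.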

B.2. Correspondence to our approach. To an empirical model $P$ we associate a
probabilistic model on $H[X]$ by setting, for each $C \in \mathcal{M}$ and each $s \in O^C$,

\[ p(s : C \to O) := P_C(s). \tag{B.2} \]

It will need to be verified that this actually is a probabilistic model, i.e. that these probabilities are
suitably normalized for every edge in $H[X]$.

Conversely, given a probabilistic model $p$ on $H[X]$, we claim that (B.2) defines an empirical
model $P$ on $X$.

Theorem B.2.1. This defines a linear bijection between empirical models on $X$ and probabilistic
models on $H[X]$.

This bijective correspondence generalizes Proposition 3.3.2 and the related [FSA+12, Lemma 1].

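Proof. We first verify that (B.2) turns an empirical model $P$ into a probabilistic model $p$. It
needs to be shown that

\[ \sum_{s \in \mathrm{Out}(T)} p(s) = 1 \tag{B.3} \]

for any measurement protocol $T$. In order to prove this, we introduce the notion of
post-measurement empirical model. Suppose that a measurement has resulted in an outcome $a \in O$
for an observable $A \in X$. Then for the subsequent measurements in the scenario $X\{A\}$ we expect
the posterior probabilities

\[ P^{\mathrm{post}(a)}_{C \setminus \{A\}}(s) := \frac{P_C(a, s)}{P_{\{A\}}(a)}, \]

where $C \in \mathcal{M}$ is a context containing $A$, $s$ is a joint outcome of the remaining observables
$C \setminus \{A\}$, and $(a, s)$ is the resulting joint outcome of $C$.

We now use induction on the size of $X$ in order to prove (B.3). The base case is $X = \emptyset$, in which
there is nothing to prove. For the induction step, we decompose $T = (A, f)$ and use the induction
assumption on each $P^{\mathrm{post}(a)}$ for those $a \in O$ with $P_{\{A\}}(a) \neq 0$; outcomes $a$ with $P_{\{A\}}(a) = 0$
contribute nothing to the sum. Then

\[ \sum_{s \in \mathrm{Out}(T)} p(s) = \sum_a \sum_{\alpha \in \mathrm{Out}(f(a))} P_{\{A\}}(a)\, P^{\mathrm{post}(a)}(\alpha) = \sum_a P_{\{A\}}(a) = 1, \]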

as was to be shown.

Conversely, we need to prove that if $p$ is a probabilistic model on $H[X]$, then the associated $P$
is an empirical model, i.e. that it satisfies (B.1). It is sufficient to consider the case $C \cap C' \neq \emptyset$, for
otherwise (B.1) is vacuous. Let $s_0 \in O^{C \cap C'}$ be an arbitrary joint outcome of the observables $C \cap C'$.
Then we consider a measurement protocol $T$ given by conducting the measurements $C \cap C'$, and then
conducting the measurements $C \setminus C'$ if the joint outcome was $s_0$, and conducting the measurements
$C' \setminus C$ otherwise. Then the normalization equation associated to this measurement protocol reads

\[ \sum_{t \in O^{C \setminus C'}} p(s_0 \cup t) + \sum_{\substack{s \in O^{C \cap C'} \\ s \neq s_0}} \sum_{t' \in O^{C' \setminus C}} p(s \cup t') = 1. \]

Comparing this with the normalization equation associated to the measurement protocol which
simply measures all observables in $C'$ and outputs their joint outcome,

\[ \sum_{s \in O^{C \cap C'}} \sum_{t' \in O^{C' \setminus C}} p(s \cup t') = 1, \]

gives, upon splitting the latter equation into the $s = s_0$ part and the $s \neq s_0$ part,

\[ \sum_{t \in O^{C \setminus C'}} p(s_0 \cup t) = \sum_{t' \in O^{C' \setminus C}} p(s_0 \cup t'), \]

as was to be shown. □
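
The normalization (B.3) can be checked by hand on a single adaptive protocol. The Python sketch below does this for the Popescu-Rohrlich box on the CHSH scenario. The encoding is hypothetical: a protocol is either `None` (the empty protocol) or a pair `(A, f)` with `f` a dict from outcomes to sub-protocols, and the code does not check that sub-protocols live on the induced scenario $X\{A\}$.

```python
from fractions import Fraction

def pr_box(s):
    """p(s) for a joint outcome s = {observable: outcome} of one Ax and one By."""
    (a_name, a), (b_name, b) = sorted(s.items())
    x, y = int(a_name[1]) - 1, int(b_name[1]) - 1
    return Fraction(1, 2) if (a ^ b) == (x & y) else Fraction(0)

def out(protocol):
    """Out(T), each outcome encoded as a dict {observable: outcome}."""
    if protocol is None:
        return [{}]
    observable, f = protocol
    return [{observable: a, **tail} for a, sub in f.items() for tail in out(sub)]

# Measure A1 first; then measure B1 if the outcome was 0, and B2 if it was 1.
leaf = {0: None, 1: None}
T = ("A1", {0: ("B1", leaf), 1: ("B2", leaf)})

total = sum(pr_box(s) for s in out(T))
print(out(T), total)
```

The four outcomes of this protocol are spread over the two contexts $\{A_1, B_1\}$ and $\{A_1, B_2\}$, and their probabilities $1/2, 0, 0, 1/2$ sum to 1, as the first half of the proof predicts for any empirical model.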

There are analogous correspondence theorems for quantum models and classical models. Since 
these are perfectly analogous both in the statement and in the proof, we do not discuss them further. 






References 

[AB11] Samson Abramsky and Adam Brandenburger, The sheaf-theoretic structure of non-locality and contextuality, New Journal of Physics 13 (2011), no. 11, 113036.

[AII06] David Avis, Hiroshi Imai, and Tsuyoshi Ito, On the relationship between convex bodies related to correlation experiments with dichotomic observables, J. Phys. A 39 (2006), 11283.

[AL06] Noga Alon and Eyal Lubetzky, The Shannon capacity of a graph and the independence numbers of its powers, IEEE Trans. Inf. Theory 52 (2006), no. 5.

[Alo98] Noga Alon, The Shannon capacity of a union, Combinatorica 18 (1998), no. 3, 301-310.

[AMSB11] Samson Abramsky, Shane Mansfield, and Rui Soares Barbosa, The Cohomology of Non-Locality and Contextuality, Proceedings 8th International Workshop on Quantum Physics and Logic (Nijmegen, 2011), 2011.

[Anj04] Miguel F. Anjos, On semidefinite programming relaxations for the satisfiability problem, Math. Methods Oper. Res. 60 (2004), no. 3, 349-367.

[AQB+12] Mateus Araújo, Marco Túlio Quintino, Costantino Budroni, Marcelo Terra Cunha, and Adán Cabello, Complete characterization of the n-cycle noncontextual polytope, 2012. arXiv:1206.3212.

[Bar07] Jonathan Barrett, Information processing in generalized probabilistic theories, Phys. Rev. A 75 (2007), no. 3, 032304.

[Bel64] John S. Bell, On the Einstein-Podolsky-Rosen paradox, Physics 1 (1964), 195-200.

[Ber61] C. Berge, Färbung von Graphen, deren sämtliche bzw. deren ungerade Kreise starr sind, Wiss. Z. Martin-Luther-Univ. Halle-Wittenberg Math.-Natur. Reihe 10 (1961), no. 114.

[Ber97] Claude Berge, Motivations and history of some of my conjectures, Discrete Math. 165-166 (1997), 61-70.

[BFRW05] Howard Barnum, Christopher A. Fuchs, Joseph M. Renes, and Alexander Wilce, Influence-free states on compound quantum systems, 2005. arXiv:quant-ph/0507108.

[BL09] Jonathan Barrett and Matthew Leifer, The de Finetti theorem for test spaces, New Journal of Physics 11 (2009), no. 3, 033024.

[BPT11] Christine Bachoc, Arnaud Pêcher, and Alain Thiéry, On the theta number of powers of cycle graphs, 2011. arXiv:1103.0444.

[Cab08] Adán Cabello, Experimentally Testable State-Independent Quantum Contextuality, Phys. Rev. Lett. 101 (2008), no. 21, 210401.

[Cab12a] Adán Cabello, Twin inequality for fully contextual quantum correlations, 2012. arXiv:1209.0112.

[Cab12b] Adán Cabello, A simple explanation of the quantum violation of a fundamental inequality, 2012. arXiv:1210.2988.

[Cab12c] Adán Cabello, Specker's fundamental principle of quantum mechanics, 2012. arXiv:1212.1756.

[CDLTP12] Adán Cabello, Lars Eirik Danielsen, Antonio J. López-Tarrida, and José R. Portillo, Basic logical structures in quantum correlations, 2012. arXiv:1211.5825.

[CEGA96] Adán Cabello, José M. Estebaranz, and Guillermo García-Alcaine, Bell-Kochen-Specker theorem: a proof with 18 vectors, Phys. Lett. A 212 (1996), no. 4, 183-187.

[CF12] Rafael Chaves and Tobias Fritz, Entropic approach to local realism and noncontextuality, Phys. Rev. A 85 (2012), no. 3, 032113.

[CGR03] B. Codenotti, I. Gerace, and G. Resta, Some remarks on the Shannon capacity of odd cycles, Ars Combinatoria 66 (2003), 243-257.

[CHSH69] John F. Clauser, Michael A. Horne, Abner Shimony, and Richard A. Holt, Proposed Experiment to Test Local Hidden-Variable Theories, Phys. Rev. Lett. 23 (1969), no. 15, 880-884.

[CMW00] Bob Coecke, David Moore, and Alexander Wilce, Operational quantum logic: an overview, Current research in operational quantum logic, 2000, pp. 1-36.

[CRST06] Maria Chudnovsky, Neil Robertson, Paul Seymour, and Robin Thomas, The strong perfect graph theorem, Ann. of Math. (2) 164 (2006), no. 1, 51-229.

[CSW10] Adán Cabello, Simone Severini, and Andreas Winter, (Non-)Contextuality of Physical Theories as an Axiom, 2010. arXiv:1010.2163.

[D'A10] Giacomo Mauro D'Ariano, Probabilistic theories: what is special about quantum mechanics?, Philosophy of Quantum Information and Entanglement, 2010.

[EF70] J. Edmonds and D. R. Fulkerson, Bottleneck extrema, Journal of Combinatorial Theory 8 (1970), no. 3, 299-306.



[Eit94] Thomas Eiter, Exact transversal hypergraphs and application to Boolean μ-functions, J. Symbolic Comput. 17 (1994), no. 3, 215-225.

[Eng97] Konrad Engel, Sperner theory, Encyclopedia of Mathematics and its Applications, vol. 65, Cambridge University Press, Cambridge, 1997.

[FC12] Tobias Fritz and Rafael Chaves, Entropic Inequalities and the Marginal Problem, IEEE Trans. Inf. Theory (2012). arXiv:1112.4788. To appear.

[Fek23] Michael Fekete, Über die Verteilung der Wurzeln bei gewissen algebraischen Gleichungen mit ganzzahligen Koeffizienten, Math. Z. 17 (1923), no. 1, 228-249.

[Fin82] Arthur Fine, Hidden variables, joint probability, and the Bell inequalities, Phys. Rev. Lett. 48 (1982), no. 5, 291-295.

[FGR92] D. J. Foulis, R. J. Greechie, and G. T. Rüttimann, Filters and supports in orthoalgebras, Internat. J. Theoret. Phys. 31 (1992), no. 5, 789-807.

[FNT12] Tobias Fritz, Tim Netzer, and Andreas Thom, Can you compute the operator norm?, 2012. arXiv:1207.0975.

[FR72] D. J. Foulis and C. H. Randall, Operational statistics. I. Basic concepts, J. Mathematical Phys. 13 (1972), 1667-1675.

[FR81] D. J. Foulis and C. H. Randall, Empirical logic and tensor products, Interpretations and foundations of quantum theory (Marburg, 1979), 1981, pp. 9-20.

[Fri12] Tobias Fritz, Tsirelson's problem and Kirchberg's conjecture, Rev. Math. Phys. 24 (2012), 1250012.

[FSA+12] Tobias Fritz, Ana Belen Sainz, Remigiusz Augusiak, Jonatan Bohr Brask, Rafael Chaves, Anthony Leverrier, and Antonio Acín, Local orthogonality: a multipartite principle for correlations, 2012. arXiv:1210.3018.

[GHSZ90] Daniel M. Greenberger, Michael A. Horne, Abner Shimony, and Anton Zeilinger, Bell's theorem without inequalities, Amer. J. Phys. 58 (1990), no. 12, 1131-1143.

[GLS01] G. Gottlob, N. Leone, and F. Scarcello, Hypertree decompositions: A survey, Mathematical Foundations of Computer Science 2001 (2001), 37-57.

[Gre71] R. J. Greechie, Orthomodular lattices admitting no states, J. Combinatorial Theory Ser. A 10 (1971), 119-132.

[GP79] Martin Grötschel and Manfred W. Padberg, On the symmetric travelling salesman problem. I. Inequalities, Math. Programming 16 (1979), no. 3, 265-280.

[Haa96] Rudolf Haag, Local quantum physics, 2nd ed., Texts and Monographs in Physics, Springer-Verlag, Berlin, 1996.

[Hae79] Willem Haemers, On some problems of Lovász concerning the Shannon capacity of a graph, IEEE Trans. Inform. Theory 25 (1979), no. 2, 231-232.

[Hae81] Willem Haemers, An upper bound for the Shannon capacity of a graph, Algebraic methods in graph theory, Vol. I, II (Szeged, 1978), 1981, pp. 267-272.

[Hen12] Joe Henson, Quantum contextuality from a simple principle?, 2012. arXiv:1210.5978.

[IK00] Wilfried Imrich and Sandi Klavžar, Product graphs: structure and recognition, Wiley-Interscience, New York, 2000.

[JNP+11] Marius Junge, Miguel Navascués, Carlos Palazuelos, David Pérez-García, Volkher B. Scholz, and Reinhard F. Werner, Connes' embedding problem and Tsirelson's problem, J. Math. Phys. 52 (2011), no. 1, 012102, 12.

[Kar72] Richard M. Karp, Reducibility among combinatorial problems, Complexity of computer computations (Proc. Sympos., IBM Thomas J. Watson Res. Center, Yorktown Heights, N.Y., 1972), 1972, pp. 85-103.

[KCBS08] A. A. Klyachko, M. A. Can, S. Binicioğlu, and A. S. Shumovsky, A simple test for hidden variables in spin-1 system, Phys. Rev. Lett. 101 (2008), 020403-020406.

[Knu94] Donald E. Knuth, The Sandwich Theorem, Electron. J. Comb. 1 (1994), A1.

[KR83] Richard V. Kadison and John R. Ringrose, Fundamentals of the theory of operator algebras. Vol. I, Pure and Applied Mathematics, vol. 100, Academic Press Inc. [Harcourt Brace Jovanovich Publishers], New York, 1983. Elementary theory.

[KS67] Simon Kochen and E. P. Specker, J. Math. Mech. 17 (1967), 59-87.

[Las02] Jean B. Lasserre, An explicit equivalent positive semidefinite program for nonlinear 0-1 programs, SIAM J. Optim. 12 (2002), no. 3, 756-769 (electronic).






[Lau03] Monique Laurent, A comparison of the Sherali-Adams, Lovász-Schrijver, and Lasserre relaxations for 0-1 programming, Math. Oper. Res. 28 (2003), no. 3, 470-496.

[Lov72] László Lovász, Normal hypergraphs and the perfect graph conjecture, Discrete Math. 2 (1972), no. 3, 253-267.

[Lov78] László Lovász, Kneser's conjecture, chromatic number, and homotopy, J. Combin. Theory Ser. A 25 (1978), no. 3, 319-324.

[Lov79] László Lovász, On the Shannon capacity of a graph, IEEE Trans. Inform. Theory 25 (1979), no. 1, 1-7.

[LSW11] Yeong-Cherng Liang, Robert W. Spekkens, and Howard M. Wiseman, Specker's parable of the overprotective seer: a road to contextuality, nonlocality and complementarity, Phys. Rep. 506 (2011), no. 1-2, 1-39.

[NP01] Denis Naddef and Yves Pochet, The symmetric traveling salesman polytope revisited, Math. Oper. Res. 26 (2001), no. 4, 700-722.

[NPA07] Miguel Navascués, Stefano Pironio, and Antonio Acín, Bounding the Set of Quantum Correlations, Phys. Rev. Lett. 98 (2007), no. 1, 010401.

[NPA08] Miguel Navascués, Stefano Pironio, and Antonio Acín, A convergent hierarchy of semidefinite programs characterizing the set of quantum correlations, New Journal of Physics 10 (2008), no. 7, 073013.

[PMMF10] Mladen Pavičić, Brendan D. McKay, Norman D. Megill, and Krešimir Fresl, Graph approach to quantum systems, J. Math. Phys. 51 (2010), no. 10, 102103, 31.

[PMMM05] Mladen Pavičić, Jean-Pierre Merlet, Brendan McKay, and Norman D. Megill, Kochen-Specker vectors, J. Phys. A 38 (2005), no. 7, 1577-1592.

[PNA10] Stefano Pironio, Miguel Navascués, and Antonio Acín, Convergent Relaxations of Polynomial Optimization Problems with Noncommuting Variables, SIAM Journal on Optimization 20 (2010), no. 5, 2157-2180.

[PR94] Sandu Popescu and Daniel Rohrlich, Quantum nonlocality as an axiom, Foundations of Physics 24 (1994), no. 3, 379-385.

[RF73] C. H. Randall and D. J. Foulis, Operational statistics. II. Manuals of operations and their logics, J. Mathematical Phys. 14 (1973), 1472-1480.

[Sch78] Thomas J. Schaefer, The complexity of satisfiability problems, Proceedings of the tenth annual ACM symposium on Theory of computing, 1978, pp. 216-226.

[Sch03] Alexander Schrijver, Combinatorial optimization. Polyhedra and efficiency, Algorithms and Combinatorics, vol. 24, Springer-Verlag, Berlin, 2003.

[Sha56] Claude E. Shannon, The zero error capacity of a noisy channel, Institute of Radio Engineers, Transactions on Information Theory, IT-2 (1956), no. September, 8-19.

[Shu74] Frederic W. Shultz, A characterization of state spaces of orthomodular lattices, J. Combinatorial Theory Ser. A 17 (1974), 317-328.

[Spe60] Ernst Specker, The logic of non-simultaneously decidable propositions, 1960. Translation from the German original by M. P. Seevinck (2011), arXiv:1103.4537.

[Spe05] Robert W. Spekkens, Contextuality for preparations, transformations, and unsharp measurements, Phys. Rev. A 71 (2005), no. 5, 052108.

[ST96] K. Svozil and J. Tkadlec, Greechie diagrams, nonexistence of measures in quantum logics, and Kochen-Specker-type constructions, J. Math. Phys. 37 (1996), no. 11.

[Svo93] Karl Svozil, Randomness & undecidability in physics, World Scientific Publishing Co. Inc., River Edge, NJ, 1993.

[Tar51] Alfred Tarski, A decision method for elementary algebra and geometry, University of California Press, Berkeley and Los Angeles, Calif., 1951. 2nd ed.

[Tka00] Josef Tkadlec, Diagrams of Kochen-Specker type constructions, Internat. J. Theoret. Phys. 39 (2000), no. 3, 921-926. Quantum structures '98 (Liptovský Ján).

[Tsi93] Boris S. Tsirelson, Some results and problems on quantum Bell-type inequalities, Hadronic Journal Supplement 8 (1993), 329-345.

[Vol09] Vitaly I. Voloshin, Introduction to graph and hypergraph theory, Nova Science Publishers Inc., New York, 2009.

[Vor62] N. N. Vorob'ev, Consistent Families of Measures and Their Extensions, Theory of Probability and its Applications 7 (1962), no. 2, 147-163.






[Wil05] Alexander Wilce, Topological Test Spaces, International Journal of Theoretical Physics 44 (2005), no. 8, 1227-1238.

[Wil08] Alexander Wilce, Formalism and Interpretation in Quantum Theory, 2008. A slightly edited version of a paper to appear as part of a Festschrift for Jeff Bub.

[Wil09] Alexander Wilce, Test spaces, Handbook of quantum logic and quantum structures: quantum logic, 2009, pp. 443-549.

[Wri78] Ron Wright, The state of the pentagon: a nonclassical example, Mathematical foundations of quantum theory (Proc. Conf., Loyola Univ., New Orleans, La., 1977), 1978, pp. 255-274.

[Zeh06] H. Dieter Zeh, Quantum nonlocality vs. Einstein locality, 2006. http://www.rzuser.uni-heidelberg.de/~as3/nonlocality.html.

[Zie95] Günter M. Ziegler, Lectures on Polytopes, Graduate Texts in Mathematics, vol. 152, Springer-Verlag, New York, 1995.






Figure 12. Our conjectures and the known implications between them. (Diagram nodes: A.2.1(a)-(c),
A.4.2(a)-(c), A.4.5, 7.5.1(a)-(c); arrows labelled A.2.2, A.4.4, A.4.6 and 7.5.2.)